Announcing Crashlytics for iOS 2.0
At Crashlytics, one of our founding principles has been an extreme (some would say absurd) attention to detail. Crash detection and reporting, particularly on iOS, is a complex and esoteric problem to solve, with arcane restrictions that throw modern programming practices out the window.

Need to allocate memory at crash-time? Revisit your approach. Thinking of calling an Objective-C method? Dream on.

This focus has not gone unnoticed: many of the world’s best mobile engineering teams – the teams that build many of the best-known apps – now trust our award-winning Crashlytics for iOS solution to deliver accurate, detailed crash reports from hundreds of millions of devices around the globe.

But we’re not satisfied.

Over the past 6 months, we’ve embarked on a ground-up rewrite of our iOS SDK to take things to a whole new level and, after 2 months of intense testing, I’m extremely pleased to publicly announce the release of Crashlytics for iOS v2.

Highlights from hundreds of improvements

When we set out to design our new iOS SDK, it was the perfect opportunity to fundamentally rethink our approach. We’ve made hundreds of improvements to our iOS SDK that have led to significant performance and stability increases. Here are some of the major ones:

  1. Mach Exceptions. Better than signal handlers. 

    All widely-used crash-reporting solutions for iOS and Mac OS are currently based on signals and uncaught exceptions. By registering handlers for both of these events, it’s possible to detect and inspect the majority of crashes that occur. As our usage exploded, however, it became painfully obvious that crashes were sneaking through. For example, it’s not possible to catch all stack-overflow crashes with a signal handler.

    Fortunately, there’s a better way. In Darwin, signals are actually implemented on top of lower-level events called Mach Exceptions. Handling these directly is the holy grail – all crashes can be captured immediately after they happen with far more precision and accuracy. The Mach Exception API is radically more complex than signal handling, but capturing every crash more than justifies the hurdles.

  2. Advanced techniques to stop secondary crashes.

    Processes that crash often end up sustaining considerable damage before the kernel takes action to terminate them. In many of the nastiest crashes we’ve seen, this can result in secondary crashes, where the crash-handling code itself is unable to operate correctly and fails, obscuring the source of the original crash. Secondary crashes have two primary causes, both due to corruption. A buffer overrun could mangle or destroy the in-memory data structures that Crashlytics uses to track state. Alternatively, hardware failures or disk errors could damage the temporary cache files used to record data before it is sent to our servers.

    Our new SDK goes to great lengths to address these scenarios. By carefully controlling its memory usage, our new SDK is able to pre-allocate a contiguous block of RAM that it then surrounds with guard pages, protecting against buffer-overruns. In the case of cache corruption, we’ve invested in making our file-handling code extremely defensive, so parsing cache files can’t crash unexpectedly.

  3. Stack unwinding. Finding the real path to the crash.

    One of the most abstruse aspects of crash detection is stack unwinding, the seemingly omniscient ability to determine the historic code execution that directly led to the crash. In practice, it involves carefully walking up the stack in memory and searching for return addresses – the instruction pointers of the calling lines.

    Writing an ARM stack unwinder that works in most cases is relatively straightforward – the stack layout for iOS is well-defined. However, things start to fall apart when custom assembly is thrown into the mix, as there are no hard rules on what can and can’t be done. It just so happens that objc_msgSend is such a function, performing countless tricks to make dynamic method invocation in Objective-C as fast as possible. All works perfectly during normal operation, but if objc_msgSend crashes, a naïve ARM unwinder could easily miss the stack frame of the calling function. Of course, that’s the critical line you need to know!

    Our new SDK uses a vastly better technique to determine the calling instruction that works in the case of objc_msgSend and many other “creative” methods that still conform in part to Apple’s iOS ABI.
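
To make the "walking up the stack" idea concrete, here is a toy sketch. It is written in Ruby only for brevity (a real unwinder is C or assembly running at crash time) and models memory as a simple address-to-word table. It assumes the common frame-record convention in which each frame pointer addresses a pair of words: the caller's saved frame pointer, followed by the return address. The `walk_frames` name, the addresses and the sanity checks are illustrative assumptions, not the SDK's implementation, and the sketch does not attempt the objc_msgSend case described above.

```ruby
WORD = 4 # bytes per word on 32-bit ARM

# Walks a chain of frame records in a simulated memory image.
#   memory: Hash mapping address => word stored at that address
#   fp:     frame pointer captured at the time of the crash
# Returns the return addresses found, innermost caller first.
def walk_frames(memory, fp, max_depth: 64)
  return_addresses = []
  max_depth.times do
    break if fp.nil? || fp.zero? # a zero frame pointer ends the chain

    saved_fp = memory[fp]
    return_address = memory[fp + WORD]
    # Unreadable memory: stop rather than follow a bad pointer.
    break if saved_fp.nil? || return_address.nil?

    return_addresses << return_address

    # The stack grows downward, so a caller's frame must sit at a higher
    # address. Anything else suggests corruption or a cycle, so stop.
    break unless saved_fp.zero? || saved_fp > fp

    fp = saved_fp
  end
  return_addresses
end

# Three nested frames: each record is [saved fp, return address].
SAMPLE_STACK = {
  0x1000 => 0x1020, 0x1004 => 0xA000,
  0x1020 => 0x1040, 0x1024 => 0xB000,
  0x1040 => 0,      0x1044 => 0xC000
}.freeze

SAMPLE_TRACE = walk_frames(SAMPLE_STACK, 0x1000)
```

The two early exits are the interesting part: a crashed process may have a damaged stack, so the walker treats every pointer it reads as untrusted and gives up rather than chase it.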

We could not be happier to get these improvements into the hands of thousands of app developers, and the feedback so far has been fantastic. Thanks to all those who have helped us test Crashlytics for iOS v2. Trust me, we’re not stopping here: the most advanced crash reporting SDK for iOS will keep getting better!

Have questions about Crashlytics for iOS v2? Send us email!



logs
Since our launch one year ago, Crashlytics has set the bar for the most informative crash reports on mobile. Above and beyond stack traces, RAM usage, and disk utilization, we’ve sought to provide all the critical data-points that developers need to pinpoint and fix issues – device orientation, battery state, even whether the device was being held up to the ear! And we’re never satisfied.

A treasure-trove of data lies in an app’s logs and there’s no better way to debug a problem than by knowing exactly what happened leading up to the critical moment. Capturing logging data has been our number-one customer request for months and our number-one concern. We care deeply about security and end-user privacy: collecting logging data opens the door to substantial risks. We wanted to start down the road to building a Splunk for Mobile.

I’m excited to announce that after focusing our R&D efforts, we think we’ve cracked it, and I wanted to share some details on our approach.

Privacy, Performance

The easiest way to deliver logging would be to capture and redirect all output from NSLog(), but this is also the easiest way to infringe on user privacy. Many apps don’t take the care they should in scrubbing log lines of personally-identifiable information: names, email addresses, even passwords often appear in URLs or internal settings that might commonly get logged. Sending this data, even encrypted over SSL, would be dangerous and in breach of most privacy policies.

Instead, we’ve chosen to introduce completely distinct logging functionality called CLSLog(), so it’s explicit what data will be collected and transmitted with Crashlytics reports.

We also took the opportunity to make some performance improvements – in our benchmarks, CLSLog() is 10X faster than NSLog() under the same conditions. Using CLSLog() could not be easier – it’s a drop-in replacement:

OBJC_EXTERN void CLSLog(NSString *format, ...); // Log messages to be sent with crash reports

NSLog(@"Detected Higgs Boson with mass %f!!", [boson mass]);
CLSLog(@"Detected Higgs Boson with mass %f!!", [boson mass]);

Options, Options, Options

Of course, in many cases you might want your log messages to also output to the system log, or show up in Xcode’s console. For these cases, we’ve also provided CLSNSLog(), which records the output and then passes it along to NSLog():

OBJC_EXTERN void CLSNSLog(NSString *format, ...); // Log messages to be sent with crash reports as well as to NSLog()

But what if both could happen? In development builds, it would be ideal for everything to pass through to Xcode’s console so debugging is as easy as possible. In release builds, though, that’s nothing but overhead – it would be great to take advantage of the blinding speed of our 100% in-memory implementation of CLSLog().

We’ve got you covered:

/**
 *
 * The CLS_LOG macro provides an easy way to gather more information in your log messages that are
 * sent with your crash data. CLS_LOG prepends your custom log message with the function name and
 * line number where the macro was used. If your app was built with the DEBUG preprocessor macro
 * defined, CLS_LOG uses the CLSNSLog function, which forwards your log message to NSLog and CLSLog.
 * If the DEBUG preprocessor macro is not defined, CLS_LOG uses CLSLog only, for a ~10X speed-up.
 *
 * Example output:
 * -[AppDelegate login:] line 134 $ login start
 *
 **/
#ifdef DEBUG
#define CLS_LOG(__FORMAT__, ...) CLSNSLog((@"%s line %d $ " __FORMAT__), __PRETTY_FUNCTION__, __LINE__, ##__VA_ARGS__)
#else
#define CLS_LOG(__FORMAT__, ...) CLSLog((@"%s line %d $ " __FORMAT__), __PRETTY_FUNCTION__, __LINE__, ##__VA_ARGS__)
#endif

In Debug builds, CLS_LOG() will pass through to NSLog, but in Release builds, it will be as fast as possible:

CLS_LOG(@"Higgs-Boson detected! Bailing out... %@", attributesDict);

Network Efficient

We’ve designed our custom logging functionality from the ground up to respect your end users’ network connections and your app’s performance. Since its implementation is entirely in-process, it’s blazingly fast with no IPC overhead. It also accepts as much data as you choose to throw at it: CLSLog() maintains an auto-scrolling 64 KB buffer of your log data, which is more than enough to record what happened in the moments leading up to a crash without exploding your app’s memory requirements or your end users’ cellular data plans. Believe it or not, it’s even more memory-efficient than it sounds – our advanced architecture doesn’t even require holding all 64 KB in RAM!
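
One way to picture an "auto-scrolling" buffer is as a byte-capped rolling log: once the cap is reached, the oldest lines are dropped to make room for new ones. The sketch below is in Ruby purely for illustration. The `RollingLog` class and its methods are hypothetical names, and unlike the SDK as described, this version simply keeps everything in memory.

```ruby
# A byte-capped log buffer that discards its oldest lines when full.
# Illustrative only: names and behavior are assumptions, not the SDK's API.
class RollingLog
  attr_reader :bytesize

  def initialize(capacity_bytes = 64 * 1024)
    @capacity = capacity_bytes
    @lines = []
    @bytesize = 0
  end

  # Appends a message, then scrolls old lines out until the buffer fits.
  def log(message)
    line = message.to_s
    # A single line larger than the whole buffer is cut down to the cap.
    line = line.byteslice(0, @capacity) if line.bytesize > @capacity

    @lines << line
    @bytesize += line.bytesize

    # Drop the oldest lines first; the newest line always survives.
    @bytesize -= @lines.shift.bytesize while @bytesize > @capacity
    self
  end

  # Returns a copy of the retained lines, oldest first.
  def lines
    @lines.dup
  end
end

buffer = RollingLog.new(10)
buffer.log('aaaa').log('bbbb').log('cccc')
buffer.lines # the first line has scrolled out of the 10-byte buffer
```

Because memory use is bounded by the cap rather than by how much the app logs, callers can log freely without tracking how much they have written.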

That’s Not All…

Viewing logging information is a whole other story. Rather than explain it, I’d encourage you to head over to our SDK Overview and see for yourself! We’re hard at work on additional SDK functionality and have much more to talk about in the coming weeks – stay tuned!

 



TL;DR: 31 lines of Rack middleware leverage Redis for highly-performant and flexible response caching.

As Crashlytics has scaled, we’ve always been on the lookout for ways to drastically reduce the load on our systems. We recently brought production Redis servers online for some basic analytics tracking and we’ve been extremely pleased with their performance and stability. This weekend, it was time to give them something a bit more load-intensive to chew on.

The vast majority – roughly 90% – of inbound traffic to our servers is destined for the same place. Our client-side SDK, embedded in apps on hundreds of millions of devices worldwide, periodically loads configuration settings that power many of our advanced features. These settings vary by app and app version, but are otherwise identical across devices – a prime candidate for caching.

There are countless built-in and third-party techniques for Rails caching, but we sought something simple that could leverage the infrastructure we already had. Wouldn’t it be great if we could specify a cache duration in any Rails action and it would “just work”?

cache_response_for 10.minutes

Rack Middleware to the Rescue

One of the most powerful features of Rack-based Rails is middleware – functionality you can inject into the request processing logic to adjust how it is handled. This will let us check Redis for a cached response or fall through to the standard Rails action.

class RackRedisCache
  def initialize(rails)
    @rails = rails
  end

  def call(env)
    cache_key = "rack::redis-cache::#{env['ORIGINAL_FULLPATH']}"

    data = REDIS.hgetall(cache_key) # empty Hash when the key is missing or has expired
    if data['status'] && data['body']
      Rails.logger.info "Completed #{data['status'].to_i} from Redis cache"
      [data['status'].to_i, JSON.parse(data['headers']), [data['body']]]
    else
      @rails.call(env).tap do |response|
        response_status, response_headers, response_body = *response
        response_cache_duration = response_headers.delete('Rack-Cache-Response-For').to_i # nil.to_i is 0

        if response_cache_duration > 0
          REDIS.hmset(cache_key,
            'status', response_status,
            'headers', response_headers.to_json,
            'body', response_body.body
          )

          REDIS.expire(cache_key, response_cache_duration) # cached entry disappears after this many seconds
          Rails.logger.info "Cached response to Redis for #{response_cache_duration} seconds."
        end
      end
    end
  end
end

A response in Rails consists of 3 components – the HTTP status, HTTP headers, and of course, the response body. For clarity, we store these under separate keys within a Hash in Redis, JSON-encoding the headers to convert them into a string.

If the cache key is not present, the middleware falls through to the action and then checks an internal header value to determine whether the action wants its response cached. The final critical line leverages Redis’ key expiration functionality to ensure the cache is only valid for a given amount of time. It couldn’t get much simpler.
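
The hit-or-miss flow can be tried out without Rails or a Redis server. What follows is a minimal sketch under stated assumptions: `FakeStore` is a hypothetical in-memory stand-in that mimics only the hash and expiry behavior relied on here, `TinyCache` is a simplified restatement of the middleware's logic rather than the production class, and the cache key is built from Rack's `PATH_INFO` instead of Rails' `ORIGINAL_FULLPATH`.

```ruby
require 'json'

# Hypothetical in-memory stand-in for the hash + expiry behavior used above.
class FakeStore
  def initialize(clock = -> { Time.now.to_f })
    @clock = clock
    @data = {}
    @expires_at = {}
  end

  # Returns the stored fields, or an empty Hash if missing or expired.
  def hgetall(key)
    expire_if_due(key)
    @data.fetch(key, {}).dup
  end

  # Stores field/value pairs as strings.
  def hmset(key, *pairs)
    fields = (@data[key] ||= {})
    pairs.each_slice(2) { |field, value| fields[field.to_s] = value.to_s }
  end

  def expire(key, seconds)
    @expires_at[key] = @clock.call + seconds
  end

  private

  def expire_if_due(key)
    deadline = @expires_at[key]
    return unless deadline && @clock.call >= deadline

    @data.delete(key)
    @expires_at.delete(key)
  end
end

# Simplified version of the caching logic for any Rack-style app.
class TinyCache
  def initialize(app, store)
    @app = app
    @store = store
  end

  def call(env)
    key = "rack::redis-cache::#{env['PATH_INFO']}"
    hit = @store.hgetall(key)
    if hit['status'] && hit['body']
      return [hit['status'].to_i, JSON.parse(hit['headers']), [hit['body']]]
    end

    status, headers, body = @app.call(env)
    ttl = headers.delete('Rack-Cache-Response-For').to_i # absent header => 0
    if ttl > 0
      chunks = []
      body.each { |chunk| chunks << chunk } # Rack bodies respond to each
      @store.hmset(key, 'status', status, 'headers', headers.to_json, 'body', chunks.join)
      @store.expire(key, ttl)
    end
    [status, headers, body]
  end
end
```

Wrapping a lambda app with `TinyCache.new(app, FakeStore.new)` and calling it twice with the same path should invoke the app only once. Injecting the clock into the store makes it easy to check that the entry disappears after its duration.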

Implementing our DSL

To tie it all together, the ApplicationController needs a simple implementation of cache_response_for that sets the header appropriately:

def cache_response_for(duration)
  headers['Rack-Cache-Response-For'] = duration.to_i.to_s # seconds, as a string header value
end

Boom. It was really that easy.

Impact?

This implementation took us only about an hour to develop and deploy, and the effects were immediate. Only 4% of these requests now fall through to Rails, CPU usage on our API servers has plummeted, and total queries to our MongoDB cluster are down 78%. An hour well spent. Our Redis cluster also doesn’t sweat its increased responsibility: its CPU usage is up just marginally!

Join Us!

Interested in working on these and other high-scale challenges? We’re hiring! Give us a shout at jobs@crashlytics.com. You can stay up to date with all our progress on Twitter and Facebook.

 



Since starting Crashlytics just over one year ago with my co-founder Wayne Chang, our mission has been clear: build tools that mobile developers love. Now, 14 months later, we’re both humbled and honored to be actively working with many of the world’s top mobile apps – Square, Yelp, Groupon, Yammer, PayPal, OpenTable, Waze, HBO, Kayak, Orbitz, Hipmunk, Viddy, Socialcam, and thousands of other organizations.

Today, we are excited to announce that Aaron Levie, the CEO at Box, has joined Crashlytics’ Advisory Board. Levie, named by Inc. as one of the Top 30 Entrepreneurs Under 30, has grown Box into an international powerhouse that now serves 92% of the Fortune 500.

Having raised over $284 million for Box, Levie is building the company for the long-term, to revolutionize how businesses collaborate and share content in the global marketplace.

“The world of mobile has changed dramatically in just the past couple of years,” says Levie, “and with over 5 billion smartphones expected to be in use by 2016, nothing is slowing down.  The scale of this market shift radically changes how all applications are built, and requires a new set of tools for the Post-PC development process.  Crashlytics sits at the center of this critical part of the ecosystem.”

The state of mobile development is still far from where it needs to be and we look forward to working with Aaron as we hone the next generation of mobile tools.

“Jeff and Wayne are world-class entrepreneurs,” Levie concludes, “and undoubtedly are building an incredibly strong foundation to bring these tools to the masses.”

Follow Aaron Levie at @levie, Wayne Chang at @Wayne, and Jeff Seibert at @jeffseibert.

 



They say that a team is only as good as its weakest member. This adage is especially true for startups, where everyone must execute at 110% and trust is paramount. In our mission to bring you the world’s most powerful crash reporting solution, we’ve worked tirelessly to build a top-notch team across all areas of the company.

Earlier this spring, we decided to bring on three summer interns to assist the full-time team in helping Crashlytics grow. Our interns have been tackling business, front-end UI, and back-end development projects with impressive results. Although their time here isn’t over quite yet, we’d like to show our gratitude for their hard work, share a little bit about our all-star interns and reveal what it’s like to be a part of Crashlytics!

Will Whitney – MIT

Will is a rising senior studying Computer Science at MIT.  Outside of class, he’s the director of StartLabs, a nonprofit which helps students start companies. Last year, he hosted a day-long series of talks by startup founders called “Startup Bootcamp”.  It was the largest startup event ever at MIT, and featured talks by speakers such as Drew Houston, Naveen Selvadurai, and Patrick Collison.  In his spare time, he masters programming languages and reads entirely too much science fiction.

Q: What part of working at Crashlytics do you enjoy the most?

Will: The best part about working here is the focus on making unmatched user experiences. Everyone is incredibly determined to not only produce really solid data, but display it in a beautiful, usable way. Even though the UI appears simple, a ton of work has gone into making it exactly right.

Q: Favorite coding tips or tricks?

Will: One of my favorite tricks is running “twistd -n ftp” from the command line. Twisted is a Python framework for event-driven web hosting, and it’s preinstalled on most Unix platforms. This command starts a new Twisted FTP server that hosts the current directory. I love being able to drop a live FTP server anywhere really fast – it really comes in handy when you’re doing a little development or hacking around on a remote machine.

Alden Keefe Sampson – Tufts

Alden describes himself as a happy hacker working on highly scalable web systems.  At 12 years old, he taught himself his first programming language with a book from the library. Professionally, he’s been the lead developer on an augmented reality mobile app as well as a video animation web app. Alden launched Tufts University’s first hackathon with the help of technology leaders around Boston (including Crashlytics!).  He has represented the US at the Junior World Championships for Sprint Canoeing in the Czech Republic, and captained his high school cross country team. In his spare time he loves cooking, photography, and riding his shiny blue bike.

Q: Why did you want to work at Crashlytics?

Alden: I was looking for a company who is solving a challenging real world problem with a high performing team and innovative technology. Crashlytics fit the bill perfectly!  The need for clear and actionable data on your app’s performance is large and growing.  We’re tackling it with talented people and cutting edge technology. I also wanted to gain more experience in the startup world.   Jeff and Wayne have proven they know how to run a successful business — having the opportunity to learn from them is second to none.

Q: Favorite coding tips or tricks?

Alden: I often find myself a few directories deep in a git repository, wishing I was in the root. So I created this alias which will jump you directly to the root of your present git project.

alias cdu='cd $(git rev-parse --show-toplevel)'

Add that to your .bashrc or .bash_profile and you’ll be home in no time. I chose cdu as the alias–think Change Directory Up–because it’s fast to type. Customize to your liking.

Zach Ringer – Babson College

Zach is an up-and-coming senior at Babson College, concentrating in finance and entrepreneurship. Aside from classwork, he invests most of his spare time playing on the Men’s Varsity Tennis Team, participating in the Alpha Epsilon Pi Fraternity, and pursuing entrepreneurial ventures of his own. In the past, Zach successfully directed a tennis racquet repair business, wrote automated trading algorithms for the FOREX markets and fine-tuned his knowledge as a financial intern at Merrill Lynch. His passions include sports, boating, watches, reading, and learning as much as humanly possible.

Q: Why did you want to work at Crashlytics?

Zach: I chose to work at Crashlytics to strengthen my understanding of the strategies employed at a venture-backed company, and how the roles of investors influence these strategies. In addition, software is a very innovative industry that I had not yet had the pleasure of delving into. I was very intrigued to see how the strategies used in software can be intertwined within other industries and vice versa.

Q: Favorite business quote?

Zach: “You will never be successful until you don’t need a dime to do what you do.”

Hats off to one incredible summer, with an incredible team of interns!

Interested in working with Will, Alden, Zach, and the rest of our team? We’re hiring! Give us a shout at jobs@crashlytics.com. You can stay up to date with all our progress on Twitter, Facebook, and Google+.

 

