Brian Fernandes
Director of Customer Engagement - Loves technology and almost everything related to computing. Wants to help you write better software. Follow at @brianfernandes.
Posted on Aug 1st 2019

Let’s face it, performance issues are the absolute worst; we know you can write a thousand lines of code before the next JavaScript framework is released, but who wants an IDE that can’t keep up with all that speed, amirite?

TL;DR:

We found and fixed a performance issue that had been affecting CodeMix users for a few months. If you ever experienced sluggishness when using the IDE, or high CPU utilization, ensure you update to version 2019.7.15 or later, and you should be good to go.

The Problem

A couple of months ago, we started receiving reports of performance issues with CodeMix. Most were about high CPU usage: sometimes while the system was supposed to be idle, and at other times the CPU would spike during use and then mysteriously drop, or recover after a restart. Rarer, but still reported, were cases where the IDE itself was affected, with sluggish typing or delays when content assist was invoked.

Obviously, nobody in our company was happy about this. Most affected, of course, were our dev and QA teams – why can’t we figure this out, and why aren’t these showing up in our tests?

A positive side effect of not being able to fix this issue immediately was that hunting for it helped us find and fix other issues, like overzealous extensions, mismanaged markers and redundant Eclipse functionality. With these fixes released, performance-related reports dropped, but did not disappear. Now, how do we get these reports? Let’s take a quick look at our support process, and how it was key in finding and fixing this issue.

Support at Genuitec

Since we released MyEclipse, way back in 2003, we’ve given all our users free access to support, through email, or our public forums. We even watch out for issues reported on Stack Overflow. Notice that I said “users”, and not “customers” – you have access to support even if you haven’t paid us a dime.

With CodeMix, we went one step further by adding a “Live Chat” system within the product, allowing you to have a live conversation with our engineers and support personnel. This is far better than a chat through our website, because we can collect diagnostic information from your IDE (more on that in a bit). It has proven to be a massive success so far: with a short chat, we’re able to fix most user issues immediately, and when we can’t, we get to bugs in the product much sooner than we would have with a forum- or email-based conversation. The real-time nature of chat has helped us find issues that we probably wouldn’t have found otherwise. So what do you get when you fix users’ problems in 5 minutes instead of a couple of hours? Even happier users!

Live Chat Saves the Day

Back to our elusive performance issue: even with Live Chat, we needed users to chat with us while they were experiencing the problem, so we could diagnose it live, and that wasn’t happening too often. When it did happen, we would have the user manually collect thread dumps at fixed intervals, which is an error-prone process. Based on the few reports that did come in, we released a blind fix of sorts for the problem. With this release, we also added some diagnostics to our Live Chat. One of these was the ability to collect stack traces at fixed intervals: all the user had to do was accept the request for this information, and we’d do the busy work of collecting it.
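To give a feel for what collecting stack traces at fixed intervals involves, here is a minimal sketch using the standard Java management API. It is an illustration only and not the diagnostic command that ships in CodeMix; the class name StackSampler and its parameters are made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CodeMix: samples all JVM threads a fixed number of times.
public class StackSampler {

    /** Takes `samples` thread dumps, `intervalMillis` apart, and returns each dump as text. */
    public static List<String> sample(int samples, long intervalMillis) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<String> dumps = new ArrayList<>();
        for (int i = 0; i < samples; i++) {
            StringBuilder dump = new StringBuilder();
            // false, false: skip lock details, we only want the stack frames
            for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
                dump.append('"').append(info.getThreadName()).append("\" ")
                    .append(info.getThreadState()).append('\n');
                for (StackTraceElement frame : info.getStackTrace()) {
                    dump.append("    at ").append(frame).append('\n');
                }
            }
            dumps.add(dump.toString());
            if (i < samples - 1) {
                Thread.sleep(intervalMillis); // wait before taking the next sample
            }
        }
        return dumps;
    }

    public static void main(String[] args) throws InterruptedException {
        // Three samples, one second apart
        for (String dump : sample(3, 1000)) {
            System.out.println(dump);
            System.out.println("-----");
        }
    }
}
```

A frame that keeps showing up across consecutive samples is a good hint about where the CPU time is going, which is why several dumps at intervals are more useful than a single one.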

When the hot fix went out, we were almost immediately disappointed to learn that it had not worked – a user came into the chat to report the same behavior with the latest release. However, this was also an opportunity to use our new diagnostics, and our new stack trace command revealed the culprit – an overzealous Outline reconciler. It was running way too frequently, and if your system was otherwise loaded, these jobs could start stacking up, hundreds of them. This is why systems could also mysteriously recover if left alone, as all the jobs would eventually run to completion.
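One common way to keep this kind of background work from piling up is to coalesce requests, so that a burst of triggers results in a single run. The sketch below shows that general technique with a plain Java executor. It is an assumption about how such a problem can be approached, not the actual CodeMix fix, and the class CoalescingReconciler and its methods are hypothetical names.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical example: at most one reconcile run is ever pending.
public class CoalescingReconciler {
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    private final Runnable reconcile;
    private final long delayMillis;
    private ScheduledFuture<?> pending; // guarded by `this`

    public CoalescingReconciler(Runnable reconcile, long delayMillis) {
        this.reconcile = reconcile;
        this.delayMillis = delayMillis;
    }

    /** Replaces any run that is still waiting, so repeated requests do not stack up. */
    public synchronized void requestReconcile() {
        if (pending != null) {
            pending.cancel(false); // no effect if that run has already started or finished
        }
        pending = executor.schedule(reconcile, delayMillis, TimeUnit.MILLISECONDS);
    }

    /** Lets the last pending run finish, then stops the worker thread. */
    public void shutdown() throws InterruptedException {
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With this shape, a hundred requests fired in quick succession lead to one reconcile run after things quiet down, rather than a hundred queued jobs competing for an already loaded CPU.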

Once we knew what the problem was, the fix was rather simple, and we released it as another hot fix. In the two weeks since the fix went live, we’ve had zero reports of the problem. Even better, a couple of users who would run into it on and off took the trouble to tell us the new version was working great for them too!

Lessons Learned

We had been seeing the performance issue for a couple of months, and in Live Chat it wasn’t easy to get the right sort of technical information from our users; it was tedious for them to collect. However, it took us just a few hours to add the right diagnostic tools to our release, and if we had done this earlier, we’d have found and fixed this issue much sooner.

Being a small company, we don’t have large QA teams, and while we tried hard, we were unable to reproduce the core performance issue. What we should have done was increase the number and variety of our stress tests, testing CodeMix on a system under higher load for a longer period of time, which would have increased the likelihood of us being able to replicate this internally, making it much easier to diagnose and fix.

We’re introducing tests that do a better job of simulating real-world conditions under which CodeMix may be used, like alongside a browser that has gone AWOL, or Windows Defender deciding it’s a great time to run a full-system scan in the middle of your work day. Our augmented diagnostics are already in the product, and we’re continuing to stay vigilant about performance issues, from Eclipse, the Code engine, or wayward extensions.
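As a rough idea of what testing under load can look like, the sketch below times a task while a few busy threads compete for the CPU. It is a simplified stand-in for a real stress test, not our actual test harness; BackgroundLoad and timeUnderLoad are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical test helper: runs a task while busy threads hog the CPU.
public class BackgroundLoad {

    /** Runs `task` while `threads` busy loops are spinning; returns the task's elapsed time in ms. */
    public static long timeUnderLoad(int threads, Runnable task) throws InterruptedException {
        AtomicBoolean running = new AtomicBoolean(true);
        List<Thread> burners = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread burner = new Thread(() -> {
                long counter = 0;
                while (running.get()) {
                    counter++; // pointless work, only here to keep a core busy
                }
            });
            burner.setDaemon(true);
            burner.start();
            burners.add(burner);
        }
        long start = System.nanoTime();
        long elapsedNanos;
        try {
            task.run();
        } finally {
            elapsedNanos = System.nanoTime() - start;
            running.set(false); // stop the busy loops
            for (Thread burner : burners) {
                burner.join();
            }
        }
        return elapsedNanos / 1_000_000;
    }
}
```

Comparing the time a task takes with zero busy threads against the time it takes with one busy thread per core gives a first hint of how it behaves on a loaded machine.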

CodeMix is the fastest it has ever been, and if you’re looking for an IDE that allows you to enjoy the best of both the enterprise and the modern-web development spaces, you need look no further – try CodeMix today, for a truly superlative dev experience.