Welcome to step 6 in your training as a scientific Python code ninja: issue tracking.
Look at you! You’ve got your nice code with some docstrings and tests, and you’re humming along analyzing data, simulating widgets, and whatnot. In the middle of it all, you realize that your regression analysis function breaks if you feed it more than 57 data points. And wouldn’t it be nice if it made a plot in addition to just printing out the correlation coefficient? And it would all be easier to use if your functions had default values for some of their arguments.
But you can’t stop to (fix that bug)/(add that feature)/(refactor that function) right now – you’re deep in all that beautiful data! Besides, maybe nobody will ever try to use this script with more than 57 data values. What should you do?
An issue may be:
GitHub – like other code repository services – has a built-in feature for tracking issues. When you raise an issue on your code’s GitHub repository, awesome things become possible:
With GitHub’s issue tracker, there’s all kinds of additional goodness: issues are searchable, can reference one another, can mention people, can be tagged/labeled, and more. Check out GitHub’s 10-minute explanation to learn all about it.
As usual, this should only take two minutes.
You’re now at a page that lists all the issues for your project (presumably there’s just one so far). If you click on an issue, you find a page where you can write further comments and even have a conversation with other developers. If you’ve never seen such a conversation, take a look at some issues for bigger projects like IPython.
It’s nice that you can keep your reminder list there in the issue tracker, but you may be worried about exposing all the shortcomings of your code in public. Don’t be! It’s much better to have them out in the open than to get surprised by them. In fact, the real magic happens when other people start reporting issues. If other people start using your code, you’ll find that somehow they run into a lot more problems than you do. Why? Because there are all kinds of unstated assumptions in your head that silently went into your code – but your users know nothing about them. So they will stress your code in completely different ways and help you find all sorts of wonderful bugs. They may even fix some of them for you – but that’s another subject. Just remember to be grateful – it’s easy to get your feathers ruffled when someone points out a flaw in your code, but in fact they are doing you a service.
This goes both ways. Your code almost certainly relies on a host of other projects, many of which are hosted on GitHub or on other services with an issue tracker. Do you use pandas, NumPy, SciPy, or scikit-learn? The next time you run into what might be a bug in those packages, be proactive. Of course you should first check Stack Overflow, but if it still seems like a bug to you, go raise an issue on the project’s issue tracker. That’s right, anybody in the world can raise an issue – you don’t need to be one of the project developers. Just remember to be polite, and follow a few best practices when you report an issue.
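One practice that maintainers commonly ask for is a note of exactly which versions you are running, since many bugs only show up in particular releases. Here is a minimal sketch of how you might collect that information to paste into an issue. The function name `environment_report` and the default package list are my own choices for illustration, not anything a particular project requires, so check the project's contributing guide for what they actually want.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("numpy", "scipy", "pandas", "scikit-learn")):
    """Return a plain-text summary of the Python and package versions in use.

    `packages` is a hypothetical default list; pass the distribution names
    that are relevant to the bug you are reporting.
    """
    lines = [
        f"Python: {sys.version.split()[0]}",
        f"Platform: {platform.platform()}",
    ]
    for name in packages:
        try:
            # Look up the installed version without importing the package.
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

Paste the output into your issue alongside the shortest snippet of code you can find that still triggers the problem, plus the full error message.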
Here’s what the issues page for my demo project looks like after raising a couple of issues. And here’s the issue tracker for a larger collaborative project. In academic research, I find that I open a lot more issues than I close. That’s okay – the issue tracker is not a to-do list that has to be fully completed at some point. It’s just a way of keeping track of useful improvements that could be made.