Course Materials

Introduction to Python Programming and Data Science

Since the Fall of 2014, we have run an intensive introductory course on Python programming at Northwestern University. Our bootcamp has been attended by about 350 members of the Northwestern community, including undergraduate and graduate students, postdoctoral fellows, faculty and staff.

Software

How to send periodic local notifications to your users

Today, I want to talk about an Apple Framework: User Notifications.

According to Apple’s website, this is a simple framework that lets us push user-facing notifications to the user’s device from a server, or generate them locally from our app.

I will not cover all the functionality that this framework provides. Instead, I will focus on a simple problem that I had:

Send a periodic notification to the user only if some condition is true. For example, the Activity ring app (installed by default on every Apple Watch) sends a notification at 50 minutes past each hour if the user has not stood up in the last 50 minutes.

We can simplify the example as follows:

We want users to do some task every day, and we alert them at 5 pm if they have not done it yet. Basically, we can reset a Boolean variable at midnight, set it when the user does the task on the device, and show the notification at 5 pm if the variable is still false.
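The decision logic described above can be sketched independently of the User Notifications framework. This is a minimal illustration in Python, not the Swift code an app would use: the class name `DailyTaskReminder` and its methods are hypothetical, and instead of resetting a Boolean at midnight it stores the date on which the task was last done, which has the same effect.

```python
from datetime import datetime, time


class DailyTaskReminder:
    """Hypothetical helper: decides whether the 5 pm reminder is due."""

    def __init__(self, remind_at=time(17, 0)):
        self.remind_at = remind_at  # time of day at which to remind
        self.done_on = None         # date the task was last completed

    def mark_done(self, now):
        # Equivalent to setting the Boolean flag to true for today.
        self.done_on = now.date()

    def should_notify(self, now):
        # The flag "resets at midnight" implicitly: a completion recorded
        # yesterday no longer matches today's date.
        task_done_today = self.done_on == now.date()
        return now.time() >= self.remind_at and not task_done_today


reminder = DailyTaskReminder()
print(reminder.should_notify(datetime(2024, 1, 1, 17, 0)))  # task not done yet
reminder.mark_done(datetime(2024, 1, 1, 17, 5))
print(reminder.should_notify(datetime(2024, 1, 1, 18, 0)))  # task done today
```

Storing a date rather than a flag avoids needing a separate job that runs exactly at midnight, which is one reasonable design choice among several.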

NetCarto

netcarto is a command line tool for finding modules (and node roles) by maximizing modularity with simulated annealing.
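For readers unfamiliar with the quantity being maximized, here is a minimal Python sketch of the standard modularity score of a partition of an undirected graph. It is an illustration only, not netcarto's code: the function name `modularity` and its inputs are assumptions, and the simulated annealing search over partitions is not shown.

```python
from collections import defaultdict


def modularity(edges, module_of):
    """Modularity of a partition of an undirected, unweighted graph.

    edges     -- list of (u, v) pairs, each edge listed once
    module_of -- dict mapping each node to its module label
    """
    total_edges = len(edges)
    internal = defaultdict(int)    # edges with both ends in the module
    degree_sum = defaultdict(int)  # sum of node degrees in the module
    for u, v in edges:
        degree_sum[module_of[u]] += 1
        degree_sum[module_of[v]] += 1
        if module_of[u] == module_of[v]:
            internal[module_of[u]] += 1
    # Fraction of edges inside each module minus the fraction expected
    # if edges were placed at random with the same node degrees.
    return sum(
        internal[s] / total_edges - (degree_sum[s] / (2 * total_edges)) ** 2
        for s in degree_sum
    )


# Two triangles joined by a single edge, split into their natural modules.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, partition))
```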

NullSeq

NullSeq is a tool for generating random nucleotide coding sequences with amino acid and GC content constraints.

Topic Mapping

TopicMapping finds topics in a set of documents using network clustering (Infomap) as an initial guess for LDA likelihood optimization. This guide explains how to compile it, what the input and output formats are, how to tune the algorithm’s parameters, and more.

Data Sets

Gun violence at US Schools

This dataset accompanies the work in "Economic insecurity and the rise in gun violence at US schools".

Passing and Shooting Data for Euro 2008

Companion data for "Quantifying the performance of individual players in a team activity" by Duch J, Waitzman JS, Amaral LAN (PLoS ONE 5, e10937, 2010).

Researchers biographical and publication data

Companion dataset to "The possible role of resource requirements and academic career-choice risk on gender differences in publication rate and impact" by Duch J, Zeng XHT, Sales-Pardo M, Radicchi F, Otis S, Woodruff TK, Amaral LAN (PLoS ONE 7, e51332, 2012).

US film citation network

List of citations to films produced in the US by other films also produced in the US.

Guides

Collections

From a programmer's perspective, memory and time are the most important concerns. Computers have limited memory, and accessing it has a computational cost.

The first thing I always do before I start programming is think about the problem and the data structure. To define the data structure well, it is necessary to know how the data will be accessed: for example, how to iterate over the elements, access a single element, or insert new elements. It is also important to know the relationship between the elements: are they unique, do they aggregate, or is there an order?
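As a small illustration of how those questions map onto Python's built-in collections, here is a sketch using made-up sample data (the `words` list is purely illustrative). Each structure answers a different question about the same elements.

```python
from collections import Counter

words = ["node", "edge", "node", "graph", "edge", "node"]

# Is there an order, and are duplicates meaningful? A list keeps both,
# but a membership test has to scan the elements one by one.
ordered = list(words)

# Are the elements unique? A set drops duplicates and offers fast
# membership tests, at the price of losing the order.
unique = set(words)

# Do the elements aggregate? A Counter maps each element to its count.
counts = Counter(words)

# Unique elements in first-seen order: dict keys preserve insertion order.
first_seen = list(dict.fromkeys(words))

print(ordered)
print(unique)
print(counts)
print(first_seen)
```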

Interactive web graphics with d3.js

A presentation that details making a simple, interactive line graph in the browser using d3.js. A Mercurial repository accompanies this presentation and contains the demo for following along.

IPython Notebooks

This was a presentation and sample notebook that I whipped up for the Amaral lab to explain why to use IPython notebooks and some of the basic hows. It goes along with my previous blog post and has this gist to go along with it.

Mounting a remote folder on OS X over SSH

The project I am currently working on needs access to a folder on a remote server. It seems like a simple task, but there is one issue: I am a Mac user.

Mounting a server folder is very useful if you have a lot of data to share with your colleagues. It is insane to copy it to your hard drive every time it changes or manage large amounts of data with version control since it will slow down the repository.

The best solution we found in the lab is to use SSH and mount folders with sshfs. It works really well on Linux, and we don't want to use a different system on other operating systems.

pyenv Tutorial

Meet pyenv: a Simple Python Version Management tool. A replacement for the now-discontinued Pythonbrew, pyenv lets you change the global Python version, install multiple Python versions, set directory-specific Python versions, and create/manage virtual Python environments. All this is done on *NIX-style machines (Linux and OS X) without depending on Python itself, and it works at the user level, with no need for any sudo commands. So let’s start!

Setting up a new development environment

Setting up your development environment on a new computer can be a pain. This guide will show you how to take your existing environment and put it into an installer script.

Speed up your Python & Numpy codes

If you run short simulations, you may tell yourself that you don’t need faster code because it only takes a few seconds, or up to a couple of minutes, and you don’t want to “waste” your time learning uninteresting coding tricks. However, my experience tells me that good programming habits are easier to learn than bad ones, they decrease the probability of having bugs in your code, and they give you a clearer and better organized result.
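One such habit is to measure before and after changing the code. The sketch below uses only the standard library (no NumPy) and is not taken from the guide itself: the function names and the sample data are hypothetical. It times two equivalent ways of finding the items that two lists have in common.

```python
import timeit


def common_items_slow(a, b):
    # Each `x in b` scans the whole list, so the cost grows with
    # len(a) * len(b).
    return [x for x in a if x in b]


def common_items_fast(a, b):
    # Build the set once; each lookup is then constant time on average.
    b_set = set(b)
    return [x for x in a if x in b_set]


if __name__ == "__main__":
    a = list(range(0, 2000, 2))
    b = list(range(0, 2000, 3))
    # Both versions must agree before their speed is worth comparing.
    assert common_items_slow(a, b) == common_items_fast(a, b)
    for fn in (common_items_slow, common_items_fast):
        seconds = timeit.timeit(lambda: fn(a, b), number=5)
        print(f"{fn.__name__}: {seconds:.4f} s")
```

The actual timings depend on the machine, so none are quoted here. The faster version is also no harder to read than the slower one.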