🧐 Data Fitting 2a - Very, Very Simple Linear Regression in R

Note: This post was updated to include an example data file. I thought it might be useful to follow up the last post with another one showing the same examples in R. R provides a function called lm, which is similar in spirit to NumPy’s linalg.lstsq. As you’ll see, lm’s interface is a bit more tuned to the concepts of modeling. We begin by reading in the example CSV into a data frame: ...

February 16, 2011 · Ryan O'Neil

🧐 Data Fitting 2 - Very, Very Simple Linear Regression in Python

This post is based on a memo I sent to some former colleagues at the Post. I’ve edited it for use here since it fits well as the second in a series on simple data fitting techniques. If you’re among the many enlightened individuals already using regression analysis, then this post is probably not for you. If you aren’t, then hopefully this provides everything you need to develop rudimentary predictive models that yield surprising levels of accuracy. ...

February 15, 2011 · Ryan O'Neil

🗳 Off the Cuff Voter Fraud Detection

Consider this scenario: You run a contest that accepts votes from the general Internet population. In order to encourage user engagement, you record any and all votes into a database over several days, storing nothing more than the competitor voted for, when each vote is cast, and a cookie set on the voter’s computer along with their apparent IP addresses. If a voter already has a recorded cookie set, they are denied subsequent votes. This way you can avoid requiring site registration, a huge turnoff for your users. Simple enough. ...

November 30, 2010 · Ryan O'Neil

🧐 Data Fitting 1 - Linear Data Fitting

Note: This post was updated to work with Python 3 and PySCIPOpt. The original version used Python 2 and python-zibopt. Data fitting is one of those tasks that everyone should have at least some exposure to. Certainly developers and analysts will benefit from a working knowledge of its fundamentals and their implementations. However, in my own reading I’ve found it difficult to locate good examples that are simple enough to pick up quickly and come with accompanying source code. ...

November 23, 2010 · Ryan O'Neil

🐍 Monte Carlo Simulation in Python

Note: This post was updated to work with Python 3. One of the most useful tools one learns in an Operations Research curriculum is Monte Carlo Simulation. Its utility lies in its simplicity: one can learn vital information about nearly any process, be it deterministic or stochastic, without wading through the grunt work of finding an analytical solution. It can be used for off-the-cuff estimates or as a proper scientific tool. All one needs to know is how to simulate a given process and its appropriate probability distributions and parameters if that process is stochastic. ...
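As a rough illustration of the idea in this excerpt (not the code from the post itself), here is a minimal sketch that estimates a probability by repeatedly simulating a process. The helper names `estimate_probability` and `two_dice_sum_to_seven` are made up for this example, and the dice process was chosen only because its exact answer, 1/6, is easy to check against.

```python
import random


def estimate_probability(simulate, trials, seed=0):
    """Estimate P(event) as the fraction of simulated trials in which it occurs.

    `simulate` is a hypothetical callback: it takes a random.Random instance,
    runs the process once, and returns True if the event of interest occurred.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = sum(1 for _ in range(trials) if simulate(rng))
    return hits / trials


def two_dice_sum_to_seven(rng):
    """One run of the process: roll two fair six-sided dice, check for a sum of 7."""
    return rng.randint(1, 6) + rng.randint(1, 6) == 7


estimate = estimate_probability(two_dice_sum_to_seven, trials=100_000)
print(f"estimated P(sum = 7) = {estimate:.4f}; exact value is {1 / 6:.4f}")
```

Swapping in a different `simulate` function is all it takes to study another process, which is the simplicity the excerpt points to.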

October 8, 2009 · Ryan O'Neil

⚡️ On the Beauty of Power Sets

One of the difficulties we encounter in solving the Traveling Salesman Problem (TSP) is that, for even a small number of cities, a complete description of the problem requires a factorial number of constraints. This is apparent in the standard formulation used to teach the TSP to OR students. Consider a set of $n$ cities with the distance from city $i$ to city $j$ denoted $d_{ij}$. We attempt to minimize the total distance of a tour entering and leaving each city exactly once. $x_{ij} = 1$ if the edge from city $i$ to city $j$ is included in the tour, $0$ otherwise: ...

February 27, 2009 · Ryan O'Neil

📐 Uncapacitated Lot Sizing

Uncapacitated Lot Sizing (ULS) is a classic OR problem that seeks to minimize the cost of satisfying known demand for a product over time. Demand is subject to varying costs for production, set-up, and storage of the product. Technically, it is a mixed binary integer linear program – the key point separating it from the world of linear optimization being that production cannot occur during any period without paying that period’s fixed costs for set-up. Thus it has linear nonnegative variables for production and storage amounts during each period, and a binary variable for each period that determines whether or not production can actually occur. ...
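To make the structure described in this excerpt concrete, here is a small sketch that solves a tiny ULS instance by brute force rather than with a MIP solver, so it is not the formulation used in the post. It enumerates every setting of the binary set-up variables and, because production is uncapacitated, serves each period's demand from the cheapest period with a set-up at or before it. The function name `solve_uls` and the instance data are invented for illustration, and `holding_cost[k]` is assumed to be the cost of carrying one unit from period `k` to period `k + 1`.

```python
from itertools import product


def solve_uls(demand, setup_cost, unit_cost, holding_cost):
    """Brute-force a tiny ULS instance; returns (best_cost, best_setups)."""
    n = len(demand)
    best_cost, best_setups = float("inf"), None
    # Try every combination of the binary set-up variables (2**n of them).
    for setups in product([0, 1], repeat=n):
        cost = sum(f for f, y in zip(setup_cost, setups) if y)
        feasible = True
        for t in range(n):
            if demand[t] == 0:
                continue
            # Per-unit cost of producing in an open period s <= t and storing until t.
            options = [
                unit_cost[s] + sum(holding_cost[s:t])
                for s in range(t + 1)
                if setups[s]
            ]
            if not options:  # no set-up paid early enough to cover this demand
                feasible = False
                break
            cost += demand[t] * min(options)
        if feasible and cost < best_cost:
            best_cost, best_setups = cost, setups
    return best_cost, best_setups


# Hypothetical three-period instance, made up for this example.
cost, setups = solve_uls(
    demand=[10, 0, 20],
    setup_cost=[50, 50, 50],
    unit_cost=[1, 1, 1],
    holding_cost=[2, 2, 2],
)
print(cost, setups)
```

Enumeration grows exponentially in the number of periods, so this is only useful for checking a model on toy data; a real instance would go to a MIP solver or a dynamic program.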

February 20, 2009 · Ryan O'Neil