A way to compare floating point values

if(0.3==0.2) ...

You shouldn’t do that of course. But there is a nice workaround if you are satisfied with the numbers being within some defined epsilon proximity. A way to do this is to get the difference between them and compare it to some chosen epsilon. If it is less, declare them equal.

if(std::fabs(0.3-0.2)<1e-10) ...

But this looks nothing like comparing for equality. Also, good luck changing that epsilon consistently in a couple of years. Instinctive reaction when making something more managable is to make a function out of it.

bool equals(double v1, double v2, double eps=1e-10) { 
   return std::fabs(v1-v2)<eps; 
if(equals(0.3, 0.2, 1e-10)) ...

Better, but try reading it aloud. It’s how a robot from 60’s science fiction would speak. And which number is epsilon? (I take that back. It’s obvious in this case, but I insist it will not always be.)
Consider instead the following draft of a class:

class dbl
   double m_val;
   explicit dbl(double val) : m_val(val) {};
   dbl equals(double val) {
      return dbl(m_val-val);
   bool within_epsilon(double eps = 1e-10) {
      return std::fabs(m_val) <= eps;
if(dbl(0.3).equals(0.2).within_epsilon(1e-10)) ...

Yes, it’s a builder pattern for building the simplest of all data types, but the meaning couldn’t be clearer and it reads like a poem! But … performance, you say. I measured. With my setup, they all equally take just over 100 CPU cycles.

Using smart pointers in building trees in which child nodes need an access to the parent

Consider building a tree. Often it is enough if only a parent node is able to reference child nodes, but this is not always the case. If parent node creates a child node that will later need an access back to the parent, what is the best way for a parent node to pass a reference to itself to the child? The only proper solution I could come up with was to inherit from std::enable_shared_from_this class and pass on the shared_ptr of pointer this. Luckily, the C++11 creators were smart enough to include that class to the standard. Other alternatives I considered have more serious shortcomings than this one. Continue reading

Error: unable to find numeric literal operator

Since C++11 we have an option to define our own literals. They look like function calls in disguise but they have some unexpected behaviour. You can for instance write:

constexpr long double operator"" 
_kilometers (long double km) 
{ return km*1000; } // meters

And then you might try to use your new literal as:

double dist1 = 5000_kilometers;
double dist2 = 5000.0_kilometers;

But you are greeted with nice error on first attempt: “unable to find numeric literal operator”.

Continue reading

How to synchronize working folders between different drives and computers

Consider my use case. I have all my important digital stuff encrypted in a TrueCrypt volume. I carry that volume around with me on a usb drive. All my projects, programming and others, are in that volume. It allows me to pick up my work and continue from any computer. For practical purposes I have a working copy of folders I currently work on, copied to a hard drive on my home computer as well. This is where it might get messy. Some changes for project could be on usb drive, some on home machine hard drive. For this I have written a simple bash script, that synchronizes all desired folders on both/all drives, while allowing each copy to be modified. Continue reading

Solving Sudoku with A*

A* algorithm is used for searching shortest paths, but is widely being adapted to search solutions for puzzles, that can be represented in form of the graph. Puzzles for which it is suitable the most are ones that do not have a very big number of possible next moves, like for instance 15-puzzle. Despite the fact that Sudoku is different in that sense, it has been attempted before [pdf] and it was successful.
Continue reading

Cosmic ray threat?

I was intrigued  by the post on Ksplice  blog titled Attack of the Cosmic Rays. In it author tracked an error in the expr program, that occasionally had a segfault, to a single randomly flipped bit in the RAM cache of the program. The author guesses it was due to cosmic ray hit and offers ZDNet article DRAM error rates: Nightmare on DIMM street as a clue. It also states that such errors are quite common, 1 bit per 4GB per day, cited from On the need to use error-correcting memory, written in 2010.
Continue reading

Beginner’s view on Python

I didn’t use Python much, just one smaller project (more about it) and assignments for second part of Stanford’s algorithms class on Coursera. Not even that: when the push came to shove at TSP size 25, I had to resort to more familiar C++, which turned out to be significantly faster for this kind of programming. Minutes versus hours. I like the language, especially it’s dynamic declarations of everything. C and C++ have limited notion of functions as first-class citizens and stuff like that was pretty new to me. It was interesting to me, how I couldn’t seem to find good use for Python features, that are unavailable in C++, where I come from :) This is how you know you’re thinking in-the-box. At the same time, this is why it is useful to learn another language. I wanted to learn basics of Python and I did. I must repeat the disclaimer for all Python fans out there: I like it :) This is merely a list of things that I had problems with as a newbie in the language.
Continue reading

C# Implementations of Binary Heaps

I needed min-heap in C# for my experiment with Sudoku solving A* and surprisingly I didn’t find it in .NET environment. There were some implementations available online, but far less then I expected. Heap is such a basic data structure, I expected an overwhelming amount of implementations in C# for various school projects or such. I just implemented it myself. Initially I only wanted to implement min-heap but later it evolved into a project of it’s own. And I still wonder, why my projects last so long, if they are lucky enough to be finished at all.

TL;DR: here is GitHub repository of Blitva, a tiny collection of C# tools, which includes described heap implementations. This is MIT licensed. You can use it any way you want.
Continue reading

This is the error, that appears when the project is started in Debug mode.

Some annoying glitches in Visual C# 2008

First let me say this: Visual Studio is a top quality tool. It is fast and especially it’s debugging capabilities are the best I’ve seen in the industry. Having said that, there are some glitches that look like they are random, without any logic behind it, and can only be fixed by guessing and doing silly things. I’ll document two of them, to help someone save an hour or two.
Continue reading

How I implemented language aware web scraper

I needed a huge sample of live Slovenian language. The simplest way would be to use digitized books, but that comes with it’s own problems. Many books are old, written in somewhat archaic language and a lot of talkative words and slang are not found in books. There is also trouble with accessing huge volumes of it since many might have some copyright limitations attached. Another thing I could do is parse known news sites, but I’d still be left with slang usage problem. The problem called for a customized web scraper, which would allow me to access vast amounts of people written content on the web. One of the problems in writing such a program was language detection, since I only needed words from specific language, otherwise I would only get useless mess out of it.
Continue reading

My view of Coursera and Algorithms: Design and Analysis course

I recently attended Coursera course Algorithms: Design and Analysis, Part 1 by professor Tim Roughgarden from Stanford University. It was great. The course was well prepared and I have only the best things to say about it. This review is a mix of feelings about this concrete course and the Coursera itself, since this course was my first introduction to Coursera.
Continue reading

Reversing graphs in Kosaraju’s algorithm is in fact necessary

Kosaraju’s algorithm for finding strongly connected components requires you to reverse graph, calculate finishing times, reverse the graph again (if you don’t want to hold two different graph structures in memory) and then follow finishing times in reverse order. From unvisited node with biggest finishing time, to the unvisited node with smallest finishing time.
Continue reading

Attempt to find modulo bias in .NET randomization

Dice five

Dice five (Photo credit: @Doug88888)

Modulo bias is an enemy of all randomized algorithms. This can easily introduce systematic randomization bias where we expect none. Is there modulo bias in .NET randomization? I tried to discover it in Random::Next(int) function. This function takes a range from which it returns a random number and if the initial random number is narrowed down to specified range by modulus operator, maybe I could show some modulo bias signal. Continue reading

Improving the quality of large legacy codebase by incrementally enforcing encapsulation

I was thinking of a way to help maintain a large complicated legacy project if it was at least meant to have object-oriented design. The problem with such projects is they were developed by an army of different people over time and there’s a chance some of them weren’t as skillful in object-oriented ways as we’d like. The blame is not only on them alone. OO paradigm is decades old, but the knowledge of how to use it most efficiently and how to handle cases that look like procedural problems has been piling up slowly. For instance, legendary Design Patterns book by the Gang of Four wasn’t published until 1995. And a little design mistake in coding 15 years ago will accumulate in technical debt, as over the years engineers were in constant hurry to deliver, without having time to refactor (if they were even aware of bad design at the time).
Continue reading