language-agnostic

compact data structure like set

I am looking for a specific data structure, but I forgot its name. If I knew the name it would be trivial; I would just look it up on Wikipedia :) Basically, it is like a set, except you cannot iterate over it. You put some values in it, let's say 80k zip codes. Then you can test if a given string is definitely NOT a zip code, but you will...
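A structure matching this description is commonly known as a Bloom filter. Below is a minimal Python sketch, assuming the truncated part of the question refers to occasional false positives. The class name, the sizing used in the usage note, and the SHA-256 double-hashing scheme are illustrative choices, not taken from the question.

```python
import hashlib


class BloomFilter:
    """Set-like structure: no false negatives, some false positives, no iteration."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # one bit per slot

    def _positions(self, value: str):
        # Assumption: derive all bit positions from two halves of one
        # SHA-256 digest (double hashing) instead of k separate hashes.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value: str) -> bool:
        # False means "definitely not added"; True means "probably added".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(value)
        )
```

For example, `BloomFilter(1_000_000, 7)` uses about 122 KiB regardless of how long the stored strings are. After `add("90210")`, `might_contain("90210")` is always true, while a string that was never added is usually, but not always, reported as absent.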

PDF creation - Tags - Authoring

This is so vague it's ridiculous, but who knows... We have a client who will not budge: they are supplying PDF files auto-generated by their own software. These files don't import into our (printing) lab management software, made by Kodak. So I emailed Kodak the error log and relevant files and got this back: DP2 supports th...

Tinyurl-style unique code: potential algorithm to prevent collisions

I have a system that requires a unique 6-digit code to represent an object, and I'm trying to think of a good algorithm for generating them. Here are the pre-reqs: I'm using a base-20 system (no caps, numbers, vowels, or l, to prevent confusion and naughty words). The base-20 allows 64 million combinations. I'll be inserting potentially...
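One possible approach, sketched in Python under stated assumptions: number the objects sequentially and scramble each number with a multiplier that is coprime to 20^6, which maps every sequence number to a distinct code and so avoids collisions by construction. The alphabet below is inferred from the stated exclusions (lowercase consonants without "l"); the multiplier and the function name `code_for` are hypothetical, not taken from the question.

```python
import math

# Lowercase letters minus the vowels a, e, i, o, u and minus "l": 20 symbols.
ALPHABET = "bcdfghjkmnpqrstvwxyz"
CODE_LENGTH = 6
SPACE = len(ALPHABET) ** CODE_LENGTH  # 20**6 == 64_000_000

# Assumption: any multiplier coprime to SPACE works; this one is arbitrary.
MULTIPLIER = 15_485_863
assert math.gcd(MULTIPLIER, SPACE) == 1


def code_for(sequence_number: int) -> str:
    """Map a sequence number in [0, SPACE) to a distinct 6-letter code."""
    if not 0 <= sequence_number < SPACE:
        raise ValueError("sequence number outside the code space")
    # Multiplying by a coprime value modulo SPACE is a bijection, so two
    # different sequence numbers can never produce the same code.
    scrambled = (sequence_number * MULTIPLIER) % SPACE
    chars = []
    for _ in range(CODE_LENGTH):
        scrambled, digit = divmod(scrambled, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))
```

The scrambling only makes consecutive codes look unrelated; it is not a security measure, since anyone who knows the multiplier can invert it.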

Does e-mail obfuscation really make automatic harvesting harder?

Many users and forum programs, in an attempt to make automatic e-mail address harvesting harder, conceal addresses via obfuscation: @ is replaced with "at" and . is replaced with "dot", so team@stackoverflow.com now becomes team at stackoverflow dot com. I'm not an expert in regular expressions and I'm really curious - does such obfuscatio...

How to implement dead reckoning when turning is involved?

"Dead reckoning is the process of estimating one's current position based upon a previously determined position and advancing that position based upon known or estimated speeds over elapsed time, and course." (Wikipedia) I'm currently implementing a simple server that makes use of dead reckoning optimization, which minimizes the upd...
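A minimal Python sketch of one way to extrapolate a position when the object is turning, assuming constant speed and a constant turn rate over the elapsed time (the object then moves along a circular arc). The function name, the parameter names, and the use of radians are assumptions made for illustration.

```python
import math


def extrapolate(x, y, heading, speed, turn_rate, dt):
    """Predict (x, y, heading) after dt seconds.

    heading is in radians, turn_rate in radians per second.
    Assumes speed and turn_rate stay constant during dt.
    """
    if abs(turn_rate) < 1e-9:
        # No turning: plain straight-line dead reckoning.
        return (
            x + speed * math.cos(heading) * dt,
            y + speed * math.sin(heading) * dt,
            heading,
        )
    # Turning: integrate the velocity along a circular arc of this radius.
    radius = speed / turn_rate
    new_heading = heading + turn_rate * dt
    return (
        x + radius * (math.sin(new_heading) - math.sin(heading)),
        y - radius * (math.cos(new_heading) - math.cos(heading)),
        new_heading,
    )
```

Both server and clients would run the same extrapolation, and the server would only need to send an update when the true state drifts from the predicted one by more than some tolerance.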

How am I supposed to know how many days something will take?

I am a PHP developer, and I often have no idea in terms of days--let alone hours--how long something will take me at work. I am often writing new stuff, merging it with old legacy crap. I can tell my boss what week I will likely have something done--and maybe what half of what week--but how in the world am I to know specifically what d...

Superset Search

I'm looking for an algorithm to solve the following in a reasonable amount of time. Given a set of sets, find all such sets that are subsets of a given set. For example, if you have a set of search terms like ["stack overflow", "foo bar", ...], then given a document D, find all search terms whose words all appear in D. I have found tw...
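One possible approach, sketched in Python: build an inverted index from each word to the search terms containing it, then count, per term, how many of its words occur in the document. A term matches when the count equals the number of distinct words in the term. The function names are hypothetical, and this is not necessarily either of the approaches the question goes on to mention.

```python
from collections import defaultdict


def build_index(terms):
    """Return (word -> terms containing it, term -> number of distinct words)."""
    index = defaultdict(list)
    sizes = {}
    for term in dict.fromkeys(terms):  # drop duplicate terms, keep order
        words = set(term.split())
        sizes[term] = len(words)
        for word in words:
            index[word].append(term)
    return index, sizes


def matching_terms(index, sizes, document_words):
    """Return every term whose words all appear in document_words."""
    hits = defaultdict(int)
    for word in set(document_words):
        for term in index.get(word, ()):
            hits[term] += 1
    return {term for term, count in hits.items() if count == sizes[term]}
```

The work per document is proportional to the number of (word, term) pairs touched, so terms that share no word with the document are never examined.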

AOP and user-specific data storage

I have been using AOP for "classic" things like logging and security for a while and am starting to take it further. One problem I come across frequently with desktop applications is the need to store user-specific data locally. To that end, I have built a component that works well for me that stores data as XML in an application-speci...

Getting the path & filename of the open document in any Windows application

Goal: Let me start with my final vision of what I'd like to be able to do first: In Windows, I'd like to be able to use a global keyboard shortcut that I define (say, Ctrl+Alt+C) to copy the full path and filename of the open document in the foreground application to the clipboard. This would be useful to, for example, be able to subs...

Sort items with minimal renumber

I need to quickly save a re-ordered sequence back to my items' integer sortOrder columns. The simple renumber-by-one approach can be slow - if the last item is moved to first, all N rows are modified. A multi-row update statement would let the database do the work, but I'd like to explore smarter ways, like making sortOrder floating point except...
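As one illustration of a "smarter way", here is a Python sketch of gapped integer ordering: items are spaced far apart, a moved item takes the midpoint between its new neighbours, and a full renumber is needed only when a gap is exhausted. This is a swapped-in technique using integers rather than the floating-point idea the question starts to describe; the function name and the default gap of 1024 are assumptions.

```python
def order_between(before, after, gap=1024):
    """Pick an integer sortOrder for an item placed between two neighbours.

    before / after are the neighbours' sortOrder values, or None at either
    end of the list. Returns None when no integer fits, which signals that
    the caller has to renumber the sequence.
    """
    if before is None and after is None:
        return 0  # first item in an empty list
    if before is None:
        return after - gap  # moved to the front
    if after is None:
        return before + gap  # moved to the end
    if after - before < 2:
        return None  # gap exhausted
    return (before + after) // 2
```

With this scheme, moving the last item to the front modifies a single row instead of all N.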

Is there an Open Source library of some sort that identifies data patterns in a table?

Okay, here's the situation: We have a table of about 50 columns (created by joining database tables) and several thousand rows. We need to identify a pattern in several known faulty records of that data. Here's a really boiled down example. Given a table:
-----------------------
| id | title | date |
-----------------------
| 01 | c ...

Should I change code to make it more testable?

Hi, I often find myself changing my code to make it more testable, and I always wonder whether this is a good idea or not. Some of the things I find myself doing are: Adding setters just so I can set an internal object to a mock. Adding getters for internal maps/lists so I can check the internal state of the object has changed after perf...

Knowledge required to build your own integer class?

Upon reaching a brick wall with the .NET framework's lack of a BigInteger class (yet), I've decided I'd like to develop my own as an exercise (I realize open source alternatives exist). What hoops do I need to jump through to be able to develop this? Are there any particular pieces of knowledge that I probably wouldn't have? edit: side q...

Array Fill w/Grouping

This is one of those "I wish I had listened better and retained more of my math class info" questions. I did this with brute force, but I know there's a better, more correct way of accomplishing this. Given a 4 x 4 array of stations, and 8 groups (a-h), how do I fill the array with group pair combinations so each pair (ab, ba) occurs only ...

Smooth average of sales data

How can I calculate the average of a set of data while smoothing over any points that are outside the "norm"? It's been a while since I had to do any real math, but I'm sure I learned this somewhere... Let's say I have 12 days of sales data on one item: 2,2,2,50,10,15,9,6,2,0,2,1 I would like to calculate the average sales per day with...
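Two common ways to reduce the influence of outliers are the median and a trimmed mean, which drops a fixed number of the smallest and largest values before averaging. A minimal Python sketch using the sales figures from the question; the function name and the choice to trim one value from each end are assumptions, and whether either measure is the right notion of "norm" depends on the part of the question that is cut off.

```python
import statistics


def trimmed_mean(values, trim_each_end):
    """Average after discarding the trim_each_end smallest and largest values."""
    if trim_each_end < 0 or 2 * trim_each_end >= len(values):
        raise ValueError("trim_each_end leaves no values to average")
    ordered = sorted(values)
    kept = ordered[trim_each_end:len(ordered) - trim_each_end]
    return sum(kept) / len(kept)


sales = [2, 2, 2, 50, 10, 15, 9, 6, 2, 0, 2, 1]

plain_mean = sum(sales) / len(sales)      # about 8.42, pulled up by the 50
robust_mean = trimmed_mean(sales, 1)      # 5.1, ignores the 0 and the 50
typical_day = statistics.median(sales)    # 2.0
```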

What should NOT be under source control?

It would be nice to have a more or less complete list of which files and/or directories shouldn't (in most cases) be under source control. What do you think should be excluded? Suggestions so far: In general: Config files with sensitive information (passwords, private keys, etc.); Thumbs.db, .DS_Store and desktop.ini; Editor backup...

Are there any conventions for flowcharting that distinguish a switch from an if-else chain?

I had to do an overview for a customer meeting, and they requested flow charts. It had never occurred to me that there was no switch symbol in any of the flow charting I've seen. I know functionally they are similar, but documentation should represent the code you've written or are planning to. Maybe I'm just being picky, but it seem...

How to identify what is not a Class?

I know the rule of thumb is that a noun used by the user is potentially a class. Similarly, a verb may be made into an action class, e.g. a predicate. Given a description from the user, how do you - identify what is not to be made into a class ...

what value does null really have?

(I know what null is and what it is used for) Question: OK, say we make a reference to an object in whatever language. The computer makes a little 32-bit (or other size, depending on computer's design) space in memory for that reference. That memory can be assigned to a value that represents an object's location in memory. But when I s...

What general purpose language should I learn next?

I'm currently participating in a programming contest (http://contest.github.com) whose goal is to create a recommendation engine. I started coding in Ruby, but soon realised it wasn't fast enough for the algorithms I had in mind. So I switched to C, which is the only non-scripting language I know. It was fast, of course, but I crin...