algorithm

The Most Efficient Way To Find Top K Frequent Words In A Big Word Sequence

The question can be described as follows. Input: a positive integer K and a big text. The text can be viewed as a word sequence, so we don't have to worry about how to break it down into words. Output: the K most frequent words in the text. My thinking is like this. 1) Use a hash table to record all words' frequency while trave...
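A minimal sketch of the hash-table idea from the excerpt, assuming the words are already available as an iterable. The excerpt is cut off after the counting step, so the size-K heap selection used here is an assumption about one reasonable follow-up and not necessarily the asker's plan. The name `top_k_words` is hypothetical.

```python
import heapq
from collections import Counter


def top_k_words(words, k):
    """Return the k most frequent words as (word, count) pairs."""
    # Hash table: word -> number of occurrences, built in one pass.
    counts = Counter(words)
    # nlargest keeps a heap of at most k entries while scanning the
    # distinct words, so selection costs O(m log k) for m distinct words.
    # Ties on count are broken by the word itself (assumption).
    return heapq.nlargest(k, counts.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    print(top_k_words("a b a c b a".split(), 2))
```

Keeping only K candidates in the heap avoids sorting the whole table, which matters when the number of distinct words is much larger than K.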

Finding the LCM of a range of numbers

I read an interesting DailyWTF post today, "Out of All The Possible Answers..." and it interested me enough to dig up the original forum post where it was submitted. This got me thinking how I would solve this interesting problem - the original question is posed on Project Euler as: 2520 is the smallest number that can be divided by...
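One common way to get the LCM of a range is to fold a pairwise LCM over it, using the identity lcm(a, b) = a * b / gcd(a, b). The sketch below is an illustration only, and the helper names `lcm` and `lcm_range` are hypothetical. As a sanity check, the LCM of 1 through 10 is 2520.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    # Divide before multiplying to keep the intermediate value small.
    return a // gcd(a, b) * b


def lcm_range(lo, hi):
    """LCM of every integer in the inclusive range [lo, hi]."""
    return reduce(lcm, range(lo, hi + 1), 1)


if __name__ == "__main__":
    print(lcm_range(1, 10))
```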

Tree Algorithm

I was thinking earlier today about an idea for a small game and stumbled upon how to implement it. The idea is that the player can make a series of moves that cause a little effect, but if done in a specific sequence would cause a greater effect. So far so good, this I know how to do. Obviously, I had to make it be more complicated (beca...

How are SSL certificates verified?

What is the series of steps needed to securely verify an SSL certificate? My (very limited) understanding is that when you visit an https site, the server sends a certificate to the client (the browser) and the browser gets the certificate's issuer information from that certificate, then uses that to contact the issuer, and somehow com...

Is there a simple algorithm that can determine if X is prime, and not confuse a mere mortal programmer?

I have been trying to work my way through Project Euler, and have noticed a handful of problems ask for you to determine a prime number as part of it. 1) I know I can just divide x by 2, 3, 4, 5, ..., square root of X and if I get to the square root, I can (safely) assume that the number is prime. Unfortunately this solution seems quite...
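The trial-division approach described in the excerpt can be sketched as below. This is a minimal version of that idea with one common refinement (skipping even divisors after checking 2); it is not claimed to be the fastest method. The function name `is_prime` is hypothetical.

```python
from math import isqrt


def is_prime(x):
    """Trial division: test divisors up to the integer square root of x."""
    if x < 2:
        return False
    if x < 4:
        return True  # 2 and 3 are prime
    if x % 2 == 0:
        return False
    # Only odd candidates remain; a composite x must have a factor <= sqrt(x).
    for d in range(3, isqrt(x) + 1, 2):
        if x % d == 0:
            return False
    return True


if __name__ == "__main__":
    print([n for n in range(30) if is_prime(n)])
```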

Pronounceable passwords?

Modules or software solutions for generating English pronounceable passwords? Are there similar modules for other languages? ...

How to pick what unit to display a value in?

I have a value and I know that its units are meters^(mn/md) * kg^(kn/kd) * s^(sn/sd) * K^(Kn/Kd) * A^(An/Ad). Note: the exponents are rational, so units of m^0.5 are valid. The question is how to break the units down into something more compact. For instance, if md=kd=sd=Kd=Ad=1, mn=Kn=An=0, kn=1, sn=-1, I can use N/m. I suspec...

Open source random number generation algorithm in C++?

I need to generate random numbers in the range 1 - 10000 continuously without duplication. Any recommendations? Description: we are building a new version of our application, which maintains records in an SQLite DB. In the last version of our application, we did not have a unique key for each record. But now, with the new upgraded version, we n...

Most efficient character counting algorithm?

Let's say you want to count the occurrences of chars in some text. The fastest way I could think of was to use an array like unsigned char charcounts[256], initialize it to zeros, then look at each char in the text input and do charcounts[c]++. Then linear search charcounts[] using two vars to keep track of the lowest (so far) char an...
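The table-of-256-counters idea from the excerpt can be sketched in Python over raw bytes. The excerpt is cut off before it says exactly what the linear search looks for, so the `most_frequent` scan below is an assumption added for illustration. Both function names are hypothetical.

```python
def count_bytes(data: bytes):
    """Count occurrences of each byte value using a fixed 256-slot table."""
    counts = [0] * 256  # one counter per possible byte value
    for b in data:      # iterating over bytes yields ints in 0..255
        counts[b] += 1
    return counts


def most_frequent(counts):
    """Single linear scan of the table for the byte with the highest count."""
    best = 0
    for value in range(1, 256):
        if counts[value] > counts[best]:
            best = value
    return best


if __name__ == "__main__":
    table = count_bytes(b"hello")
    print(chr(most_frequent(table)), table[ord("l")])
```

Python integers do not overflow, whereas the `unsigned char` counters in the excerpt would wrap after 255 occurrences; a wider counter type is probably needed in C.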

When is loop unwinding effective?

Hi all, Loop unwinding is a common way to help the compiler optimize performance. I was wondering if and to what extent the performance gain is affected by what is in the body of the loop: number of statements; number of function calls; use of complex data types, virtual methods, etc.; dynamic (de)allocation of memory. What rules (of...

Creating your own Tinyurl style uid

I'm writing a small article on humanly readable alternatives to Guids/UIDs, for example those used on TinyURL for the url hashes (which are often printed in magazines, so need to be short). The simple uid I'm generating is - 6 characters: either a lowercase letter (a-z) or 0-9. "According to my calculations captain", that's 6 mutually...
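The size of the ID space described in the excerpt can be checked directly: 36 symbols (a-z plus 0-9) in 6 positions gives 36^6 = 2,176,782,336 possible values. The sketch below also shows one way to draw such an ID. Using Python's `secrets` module is an assumption made for illustration, and `make_uid` and `id_space` are hypothetical names.

```python
import secrets
import string

# 26 lowercase letters plus 10 digits = 36 symbols, as in the excerpt.
ALPHABET = string.ascii_lowercase + string.digits


def make_uid(length=6):
    """Draw each character independently and uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def id_space(length=6):
    """Number of distinct IDs of the given length."""
    return len(ALPHABET) ** length


if __name__ == "__main__":
    print(make_uid(), id_space())
```

Random draws can collide, so a real system would still need a uniqueness check against IDs already issued.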

Latent Dirichlet Allocation, pitfalls, tips and programs

I'm experimenting with Latent Dirichlet Allocation for topic disambiguation and assignment, and I'm looking for advice. Which program is the "best", where best is some combination of easiest to use, best prior estimation, and fast? How do I incorporate my intuitions about topicality? Let's say I think I know that some items in the corpus a...

What is the most efficient/elegant way to parse a flat table into a tree?

Assume you have a flat table that stores an ordered tree hierarchy:

Id  Name          ParentId  Order
1   'Node 1'      0         10
2   'Node 1.1'    1         10
3   'Node 2'      0         20
4   'Node 1.1.1'  2         10
5   'Node 2.1'    3         10
6   'Node 1.2'    1         20

What minimalistic appro...
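One possible approach, sketched with the six rows from the excerpt: group rows by ParentId, sort each group by Order, then walk depth-first from the root. Treating ParentId 0 as "no parent" is an assumption read off the sample data, and the names `build_children` and `walk` are hypothetical.

```python
from collections import defaultdict

# (Id, Name, ParentId, Order) rows copied from the excerpt.
ROWS = [
    (1, "Node 1", 0, 10),
    (2, "Node 1.1", 1, 10),
    (3, "Node 2", 0, 20),
    (4, "Node 1.1.1", 2, 10),
    (5, "Node 2.1", 3, 10),
    (6, "Node 1.2", 1, 20),
]


def build_children(rows):
    """Map each ParentId to its children, sorted by Order."""
    children = defaultdict(list)
    for node_id, name, parent_id, order in rows:
        children[parent_id].append((order, node_id, name))
    for siblings in children.values():
        siblings.sort()
    return children


def walk(children, parent_id=0, depth=0):
    """Yield (depth, name) pairs in depth-first, Order-respecting sequence."""
    for _order, node_id, name in children.get(parent_id, []):
        yield depth, name
        yield from walk(children, node_id, depth + 1)


if __name__ == "__main__":
    for depth, name in walk(build_children(ROWS)):
        print("    " * depth + name)
```

The grouping pass is linear in the number of rows, so the whole build costs one scan plus the per-sibling sorts.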

Efficiently querying one string against multiple regexes.

Let's say that I have 10,000 regexes and one string and I want to find out if the string matches any of them and get all the matches. The trivial way to do it would be to just query the string one by one against all regexes. Is there a faster, more efficient way to do it? EDIT: I have tried substituting it with DFAs (lex). The problem he...

Multiplication of very long integers.

Is there an algorithm for accurately multiplying two arbitrarily long integers together? The language I am working with is limited to 64-bit unsigned integer length (maximum integer size of 18446744073709551615). Realistically, I would like to be able to do this by breaking up each number, processing them somehow using the unsigned 64-bi...
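A sketch of schoolbook multiplication on fixed-size "limbs", which is one standard way to break the numbers up as the excerpt suggests. The 32-bit limb size is an assumption chosen so that every intermediate value (limb product plus carry plus the existing digit) stays within an unsigned 64-bit range. Python's built-in big integers are used only to convert to and from limbs and to check the result. All names are hypothetical.

```python
BASE_BITS = 32
MASK = (1 << BASE_BITS) - 1


def to_limbs(n):
    """Split a non-negative integer into 32-bit limbs, least significant first."""
    limbs = []
    while n:
        limbs.append(n & MASK)
        n >>= BASE_BITS
    return limbs or [0]


def from_limbs(limbs):
    """Reassemble an integer from little-endian 32-bit limbs."""
    n = 0
    for limb in reversed(limbs):
        n = (n << BASE_BITS) | limb
    return n


def multiply_limbs(a, b):
    """Schoolbook O(len(a) * len(b)) multiplication of two limb lists."""
    result = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            # At most (2^32 - 1)^2 + 2 * (2^32 - 1) = 2^64 - 1: fits in 64 bits.
            total = result[i + j] + x * y + carry
            result[i + j] = total & MASK
            carry = total >> BASE_BITS
        result[i + len(b)] += carry
    return result


if __name__ == "__main__":
    x = 18446744073709551615
    print(from_limbs(multiply_limbs(to_limbs(x), to_limbs(x))) == x * x)
```

For very long operands, asymptotically faster methods such as Karatsuba exist, but the limb representation stays the same.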

What language/platform would you recommend for CPU-bound application?

I'm developing a non-interactive CPU-bound application which does only computations, almost no IO. Currently it runs too long, and while I'm working on improving the algorithm, I'm also wondering whether changing the language or platform would give any benefit. Currently it is C++ (no OOP so it is almost C) on Windows compiled with Intel C++ compiler. ...

Unique random numbers in O(1)?

The problem is this: I'd like to generate unique random numbers between 0 and 1000 that never repeat (i.e. 6 doesn't come out twice), but that doesn't resort to something like an O(N) search of previous values to do it. Is this possible? ...
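One well-known technique that fits this requirement is an incremental Fisher-Yates shuffle: keep the unused values in an array, pick a random index among them, and swap the chosen value out of the live region. Each draw is O(1) after an O(N) setup, with no search of previous values. The sketch below is an illustration under that assumption, and the class name `UniqueRandom` is hypothetical.

```python
import random


class UniqueRandom:
    """Draw each integer in [low, high] exactly once, in random order."""

    def __init__(self, low, high, rng=None):
        self._pool = list(range(low, high + 1))  # O(N) setup, done once
        self._remaining = len(self._pool)
        self._rng = rng or random.Random()

    def draw(self):
        """Return an unused value in O(1); raise ValueError when exhausted."""
        if self._remaining == 0:
            raise ValueError("all values have been drawn")
        i = self._rng.randrange(self._remaining)
        last = self._remaining - 1
        # Move the chosen value to the end of the live region, then shrink it.
        self._pool[i], self._pool[last] = self._pool[last], self._pool[i]
        self._remaining = last
        return self._pool[last]


if __name__ == "__main__":
    source = UniqueRandom(0, 1000)
    print([source.draw() for _ in range(5)])
```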

Anyone know of a good algorithm for rendering an HTML table to an image?

There is a standard two-pass algorithm mentioned in RFC 1942: http://www.ietf.org/rfc/rfc1942.txt however I haven't seen any good real-world implementations. Anyone know of any? I haven't been able to find anything useful in the Mozilla or WebKit code bases, but I am not entirely sure where to look. I guess this might actually be a deep...

Non-deterministic finite state machines in software development?

Recently I've been thinking about finite state machines and how I would implement them in software (programming language doesn't matter). My understanding is that deterministic state machines are in widespread use (parsers/lexers, compilers and so on). Okay, that's great. But what's the matter with non-deterministic state machines? I kn...

Graph Theory library for Smalltalk

Anybody know of an implementation of graph algorithms in Smalltalk? I'd like something that allows you to implement an interface on your model objects or something and provides algorithms for transitive closure, transitive reduction, topological sort, etc., etc. People end up re-implementing these widely-applicable algorithms so often,...