database-design

How does StackOverflow optimise the performance for the display of the questions?

Hi there, I am trying to learn C#.NET to program a web app, and I was happy to discover that Stack Overflow uses C#.NET. I noticed that on the home page or in the questions section, whenever I refresh the page, it always returns the latest information without fail and at acceptable speeds. I am not sure how ...

How to relate multiple models to one model that will rule them all, in Rails?

Let's say I have four completely independent models (Movie, Book, Game, Album) that control the types of things I have in my media collection. With them I can CRUD and tag individual albums, movies etc. But I need to keep track of, and do some stuff that is common to, all of them. So I figured I need an Item model that would give me a tabl...

When to include related fields in a db table and when not to?

I have a table of "posts" and I want to keep track of the ratings for each post. Normally, I would add a rating field with int values in that same table which references another "rating" table that holds actual rating data. Is there anything wrong with removing that rating field and calling the ratings directly from the rating table base...

Devising a test for a Web Developer

I need to devise a test for web developers. The test should screen for both a good grasp of the DOM and how to manipulate it, and good skills in designing scalable and efficient DB and server-side code, salted a bit with web-specific problems (like translating from one encoding to another, or cleaning input for security). And best of all, cram it int...

Which way to structure Data for User Generated Form Templates

Hello, I'm thinking about an idea for something, as well as learning Ruby on Rails (easy eh :) ). I want something to allow a user to generate forms as templates, then assign these templates as forms in a location in a tree hierarchy, then allow users to fill in instances of these forms and save the data. So, I've got two different t...

Hadoop HBase: Spreading column families across tables or not

The HBase documentation makes it clear that you should group similar columns into column families, because the physical storage is done by column family. But what does it mean to put two column families into the same table, as opposed to having separate tables per column group? Are there specific cases when "partitioning" tables this w...

How to separate automatically populated tables from manually populated tables, properly, in SQL Server?

Let's say I have the following 2 tables in a database:

[Movies] (Scheme: Automatic)
----------------------------
MovieID
Name

[Comments] (Scheme: Manual)
----------------------------
CommentID
MovieID
Text

The "Movies" table gets updated by a service every 10 minutes and the "Comments" table gets updated manually by the users of the d...

MySQL: Views vs Stored Procedures

Since MySQL started supporting stored procedures, I've never really used them. Partly because I'm not a great query writer, partly because I often work with DBAs who make those choices for me, partly because I'm just comfy with What I Know. In terms of doing data selection, specifically when considering a select that is essentially a d...

How do I 'refactor' SQL Queries?

I have several MS Access queries (in views and stored procedures) that I am converting to SQL Server 2000 (T-SQL). Due to Access's limitations regarding sub-queries, and/or the limitations of the original developer, many views have been created that function only as sub-queries for other views. I don't have a clear business requirement...

Best optimizing a large DB for primary key queries

Suppose you have a very large database and, to simplify, let's say it consists of one major table you will be doing your lookups on, with one (and only one) primary key field - pk. Given the fact that all lookups are going to be basically SELECT * FROM table_name WHERE pk=someKeyValue, what is the best way to optimize this database for the...

Help figuring out approaches to (near) real time multi dimensional data querying

I have a system that involves numerous related tables. Think of a standard category/product/order/customer/orderitem scenario. Some tables are self referencing (like Categories). None of the tables are particularly large (around 100k rows with an estimated scale to around 1 million rows). There are a lot of dimensions to this data I ...

Does selecting only indexed attributes result in faster queries?

When performing a query where the attributes selected make up the components of an index, does that result in a faster query? I would imagine that the query planner/optimizer could see that the requested columns could be satisfied completely by the index scan.

Trivial example:

CREATE TABLE "liked" ( "id" BIGINT NOT NULL DEFAULT nextva...

What are some best practices and "rules of thumb" for creating database indexes?

I have an app that cycles through a huge number of records in a database table and performs a number of SQL and .NET operations on records within that database (currently I am using Castle.ActiveRecord on PostgreSQL). I added some basic btree indexes on a couple of the fields, and as you would expect, the performance of the SQL operati...

Generating reports from MySQL tables

Let's say you have a bunch of MySQL tables, and you want your end users to be able to generate reports with that data with a PHP script. You can present the field names from those tables in a dropdown, so a user might be able to say, "first_name equals John." Great. But what if you want those field names to be a little more readable? For...

Best structure for centralized User DB over multiple membership-driven sites?

We've built a social networking site for a client. It did very well and now they want to package it up and license multiple copies of the same site but branded for their client. Each site is fairly autonomous except that users on one site can access the content from users on another site, requiring that user profiles be centralized. W...

Database efficiency

I am about to write a program to keep track of my school assignments and I was wondering what database language would be the most efficient and simple to implement to track the meta-data of the assignments? I am thinking about XML, but it would require several documents. I (currently) have at least ten assignments per week for 45 weeks....

Re-indexing large table - how screwed am I?

I have a 1 TB, 600m-row table which has a misguided choice of indexed columns, specifically a clustered index on the primary key column, which is never used in a select query. I want to remove the clustered index from this column and create it on a number of other columns. Table is currently like this: colA (PK, nvarchar(3)) [clustered i...

Database Design - Best way to show available hours?

I am interested in seeing suggestions for a database design regarding business hours. It would be quite similar to what Facebook has - I have a list of businesses, and I would like for users to be able to input multiple sets of available hours for that business. e.g., Monday: open 9-5; Tuesday: open 9-12; 1-5; etc. I would not like ...

Is any group or foundation developing an algorithm for better storing massive amounts of data?

I've looked at several approaches to enterprise architecture for databases that store massive amounts of data, and it usually comes down to more hardware, database sharding, and storing JSON objects. Has any group been doing research, or does anyone have a more dynamic approach that processes the available data and tells you how to bette...

Storing data packets in a database

Problem description: In my application, I have to present the contents of data packets with a certain format. An example: any packed binary data, for instance a 4-byte header, a 4-byte type (type codes having pre-defined meanings), then source address, destination address, and so on. Previously, I made home cooked implementati...