For example, suppose we are doing analytics and recording page_type, item_id, date, pageviews, and timeOnPage for each record.
It seems that there are several ways to avoid duplicate records. Is there an automatic way?
Create an index on the fields that uniquely identify the record, for example
[page_type, item_id, date]
and make the index unique, so that the database rejects any attempt to add the same record again.
Or make the fields above the primary key, which is unique, if the DB or framework supports it. In Rails, though, the primary key is usually the auto-incrementing ID (1, 2, 3, 4, ...).
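The unique-index approach can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module with an in-memory database. The table name analytics comes from the query later in this post, while the index name, the column types, and the add_record helper are assumptions made up for the example.

```python
import sqlite3

# Hypothetical in-memory table; column names follow the question,
# column types are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE analytics (
        id INTEGER PRIMARY KEY,
        page_type TEXT NOT NULL,
        item_id INTEGER NOT NULL,
        date TEXT NOT NULL,
        pageviews INTEGER NOT NULL,
        timeOnPage REAL NOT NULL
    )
    """
)

# The unique index over the three identifying fields.
conn.execute(
    "CREATE UNIQUE INDEX idx_analytics_unique "
    "ON analytics (page_type, item_id, date)"
)


def add_record(row):
    """Insert one row; return False if the unique index rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO analytics "
                "(page_type, item_id, date, pageviews, timeOnPage) "
                "VALUES (?, ?, ?, ?, ?)",
                row,
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = add_record(("product", 42, "2010-05-01", 100, 35.5))
second = add_record(("product", 42, "2010-05-01", 100, 35.5))
row_count = conn.execute("SELECT COUNT(*) FROM analytics").fetchone()[0]
print(first, second, row_count)
```

Here the second insert is rejected by the database itself, so only one row is stored. Other databases report the violation with their own error type, so the except clause would differ.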
Or query for the record using
[page_type, item_id, date]
, and then update that record if it already exists (or do nothing if pageviews and timeOnPage already have the same values). If the record doesn't exist, insert a new record with this data. But if we need to query the record this way, it looks like we need an index on these 3 fields anyway.
Or insert new records all the time, but when querying for the values, use something like
select * from analytics where ... order by created_at desc limit 1
that is, get the most recently created record and ignore the rest. But this seems to work only when fetching a single record; it is not so feasible when summing up values (doing aggregates), such as select sum(pageviews) or select count(*).
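The second approach above (look up by [page_type, item_id, date], then update or insert) could look roughly like this. It is only a sketch using Python's built-in sqlite3 module. The save_record helper, its return values, and the column types are assumptions for illustration, not part of any framework.

```python
import sqlite3

# Hypothetical in-memory table; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE analytics ("
    " id INTEGER PRIMARY KEY,"
    " page_type TEXT, item_id INTEGER, date TEXT,"
    " pageviews INTEGER, timeOnPage REAL)"
)


def save_record(page_type, item_id, date, pageviews, time_on_page):
    """Look up by (page_type, item_id, date), then insert, skip, or update."""
    with conn:  # one transaction per call
        existing = conn.execute(
            "SELECT id, pageviews, timeOnPage FROM analytics "
            "WHERE page_type = ? AND item_id = ? AND date = ?",
            (page_type, item_id, date),
        ).fetchone()

        if existing is None:
            conn.execute(
                "INSERT INTO analytics "
                "(page_type, item_id, date, pageviews, timeOnPage) "
                "VALUES (?, ?, ?, ?, ?)",
                (page_type, item_id, date, pageviews, time_on_page),
            )
            return "inserted"

        row_id, old_views, old_time = existing
        if (old_views, old_time) == (pageviews, time_on_page):
            return "unchanged"  # same values already stored, nothing to do

        conn.execute(
            "UPDATE analytics SET pageviews = ?, timeOnPage = ? WHERE id = ?",
            (pageviews, time_on_page, row_id),
        )
        return "updated"


results = [
    save_record("product", 42, "2010-05-01", 100, 35.5),
    save_record("product", 42, "2010-05-01", 100, 35.5),
    save_record("product", 42, "2010-05-01", 120, 40.0),
]
rows = conn.execute("SELECT pageviews, timeOnPage FROM analytics").fetchall()
print(results, rows)
```

Without a unique index on the three fields, two writers running this at the same time could both find no existing row and both insert, so this approach probably still wants the unique index as a safety net.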
Is there also some automatic solution besides using the methods above?