26

Trello shows a historical log of everything that any user has done since the board's inception. Likewise, if you click on a specific card, it shows the history of everything anyone has done related to that card.

Tracking every change, addition, and deletion, and keeping it indefinitely, must generate a huge amount of data, and writing to the history log could become a bottleneck (assuming each event is written immediately to some data store). It isn't as if they store everything in log files spread across thousands of servers and only collect and parse them when they need to find something; they display all of this information all the time.

I know this isn't the only service that provides something like this, but how would you go about architecting such a system?

4

3 Answers

34

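I'm on the Trello team. We use an Actions collection in our MongoDB instance, with a compound index on the IDs of the models it refers to (a Card is a model, and so is a Member) and the date when the action was performed. There is no fancy caching or anything, except insofar as the indexes and recently used documents are kept in memory by the DB. Actions is our biggest collection.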

Most of the data needed to display an action is stored denormalized in the action document, which speeds things up considerably.
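To make the idea concrete, here is a minimal in-memory sketch of an action log with a compound (model ID, date) index and denormalized display data. It is not Trello's code and it does not use MongoDB: the class `ActionLog`, the field names (`type`, `models`, `display`) and the sample IDs are all assumptions made for illustration, and a sorted Python list stands in for the database index.

```python
import bisect
from datetime import datetime, timedelta


class ActionLog:
    """In-memory stand-in for an actions collection (hypothetical, not Trello's code)."""

    def __init__(self):
        self._actions = {}  # action id -> denormalized action document
        self._index = []    # sorted (model_id, date, action_id) entries
        self._next_id = 1

    def record(self, action_type, date, models, display):
        """Store one action and index it under every model it refers to."""
        action_id = self._next_id
        self._next_id += 1
        self._actions[action_id] = {
            "id": action_id,
            "type": action_type,
            "date": date,
            "models": dict(models),    # e.g. card, board and member IDs
            "display": dict(display),  # names copied in, so rendering needs no joins
        }
        for model_id in set(models.values()):
            bisect.insort(self._index, (model_id, date, action_id))
        return action_id

    def history(self, model_id, limit=50):
        """Return the newest actions for one model using a range scan on the index."""
        lo = bisect.bisect_left(self._index, (model_id,))
        hi = bisect.bisect_right(self._index, (model_id, datetime.max, float("inf")))
        start = max(lo, hi - limit)
        return [self._actions[aid] for _, _, aid in reversed(self._index[start:hi])]


log = ActionLog()
t0 = datetime(2012, 5, 14, 2, 0)
log.record("createCard", t0,
           {"card": "card-1", "board": "board-1", "member": "member-1"},
           {"card_name": "Write docs", "member_name": "Alice"})
log.record("commentCard", t0 + timedelta(minutes=5),
           {"card": "card-1", "board": "board-1", "member": "member-2"},
           {"card_name": "Write docs", "member_name": "Bob"})
log.record("createCard", t0 + timedelta(minutes=10),
           {"card": "card-2", "board": "board-1", "member": "member-1"},
           {"card_name": "Ship it", "member_name": "Alice"})

card_history = log.history("card-1")    # card view: only this card's actions
board_history = log.history("board-1")  # board view: everything on the board
```

Because each action is indexed once per model it touches, the card view and the board view are the same kind of lookup: find the first index entry for that ID, then read a contiguous run of entries in date order. The price is the usual one for denormalization: if a card is renamed, older action documents still carry the old name unless they are rewritten.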

answered 2012-05-14T02:07:15.623
3

The easiest way that comes to mind is to have a table like:

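CREATE TABLE HistoryItems (
    ID INT NOT NULL,
    UserID INT NOT NULL,
    DateTime DATETIME NOT NULL,
    Data VARBINARY(MAX) NULL,  -- or VARCHAR(MAX), etc., depending on the payload
    PRIMARY KEY NONCLUSTERED (ID, UserID)  -- both columns are part of the key
);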

Indexing this on UserID allows for fast retrieval. A covering index would let the database fetch a user's entire history with a single index seek followed by a sequential range scan, however long that history is.

This table could instead be clustered on (UserID asc, DateTime desc, ID), so you wouldn't need any additional index and would still get optimal performance.
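The clustered layout can be tried out with a small runnable sketch. It swaps in SQLite (through Python's sqlite3 module) for the SQL Server implied by the types above, purely so the example runs anywhere: a WITHOUT ROWID table stores its rows in primary-key order, which plays the role of the clustered index here. The table name `history_items`, the column names and the sample rows are assumptions for illustration. The key is declared ascending and the query reads it in reverse, rather than declaring the timestamp descending.

```python
import sqlite3

history_db = sqlite3.connect(":memory:")
history_db.execute(
    """
    CREATE TABLE history_items (
        user_id    INTEGER NOT NULL,
        created_at TEXT    NOT NULL,  -- ISO 8601 text, so string order is time order
        id         INTEGER NOT NULL,
        data       TEXT,
        PRIMARY KEY (user_id, created_at, id)
    ) WITHOUT ROWID
    """
)
history_db.executemany(
    "INSERT INTO history_items (user_id, created_at, id, data) VALUES (?, ?, ?, ?)",
    [
        (1, "2012-05-08T19:50:00", 1, "created card"),
        (2, "2012-05-08T19:52:00", 2, "joined board"),
        (1, "2012-05-08T19:56:08", 3, "renamed card"),
        (1, "2012-05-08T19:56:08", 4, "moved card"),
    ],
)


def user_history(db, user_id, limit=20):
    """Newest-first history for one user, read straight from the key-ordered table."""
    cursor = db.execute(
        "SELECT created_at, data FROM history_items "
        "WHERE user_id = ? ORDER BY created_at DESC, id DESC LIMIT ?",
        (user_id, limit),
    )
    return cursor.fetchall()


recent_for_user_1 = user_history(history_db, 1)
```

All rows for one user sit next to each other in the table's B-tree, so the query is one lookup for the user followed by a walk over adjacent entries, with no separate index to maintain on writes.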

An easy problem for a relational database.

answered 2012-05-08T19:56:08.327
1

I have something very similar to what @Brett from Trello described above in my PHP + MySQL app, which I use to track user activity in the order and production management system for our online store.

I have a table, activities, which holds:

  • user_id: the user who performed the action
  • action_id: the action that was performed (e.g. create, update, delete, and so on)
  • resource: an ENUM of the resource (model) types the action can be performed on (e.g. orders, invoices, products, etc.)
  • resource_id: the PK of the resource the action was performed on
  • description: a text description of the action (can be null)

It's a large table indeed, but with the right indexes it performs very well. It serves its purpose and is simple and fast. It currently holds 200k records and grows by about 1,000 new entries per day.
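A minimal sketch of such a table follows, using SQLite through Python's sqlite3 module rather than the PHP + MySQL stack described above, so that it runs without a server. The answer does not say which indexes are used, so the two indexes here are guesses at "the right indexes"; the `ACTIONS` mapping, the helper names and the sample rows are likewise hypothetical. A CHECK constraint stands in for MySQL's ENUM, and the auto-assigned `id` is used for ordering because the column list above has no timestamp.

```python
import sqlite3

ACTIONS = {"create": 1, "update": 2, "delete": 3}  # hypothetical action_id values

activity_db = sqlite3.connect(":memory:")
activity_db.executescript(
    """
    CREATE TABLE activities (
        id          INTEGER PRIMARY KEY,  -- insertion order doubles as time order here
        user_id     INTEGER NOT NULL,
        action_id   INTEGER NOT NULL,
        resource    TEXT    NOT NULL
                    CHECK (resource IN ('orders', 'invoices', 'products')),
        resource_id INTEGER NOT NULL,
        description TEXT
    );
    -- Assumed indexes: one per lookup path (by resource, by user).
    CREATE INDEX idx_activities_resource ON activities (resource, resource_id);
    CREATE INDEX idx_activities_user ON activities (user_id);
    """
)


def log_activity(db, user_id, action, resource, resource_id, description=None):
    """Append one activity row and return its id."""
    with db:  # commits on success, rolls back on error
        cursor = db.execute(
            "INSERT INTO activities "
            "(user_id, action_id, resource, resource_id, description) "
            "VALUES (?, ?, ?, ?, ?)",
            (user_id, ACTIONS[action], resource, resource_id, description),
        )
    return cursor.lastrowid


def resource_history(db, resource, resource_id, limit=50):
    """Newest-first activity for a single resource."""
    cursor = db.execute(
        "SELECT user_id, action_id, description FROM activities "
        "WHERE resource = ? AND resource_id = ? ORDER BY id DESC LIMIT ?",
        (resource, resource_id, limit),
    )
    return cursor.fetchall()


log_activity(activity_db, 7, "create", "orders", 1001, "order placed")
log_activity(activity_db, 7, "update", "orders", 1001)
log_activity(activity_db, 9, "delete", "invoices", 55, "duplicate invoice")
orders_1001 = resource_history(activity_db, "orders", 1001)
```

At roughly 1,000 inserts per day, each write only has to touch the table and two small indexes, which is consistent with the observation that the table stays fast at a few hundred thousand rows.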

answered 2013-09-16T15:19:09.217