DB Structure for Storing Financial Statements


8 replies to this topic

#1 generalt

generalt

    Neowinian

  • Joined: 09-May 07

Posted 26 December 2012 - 17:39

I want to store company financial information in a MySQL database. The database will need to store information from companies' balance sheets, income statements, and statements of cash flow for multiple years. I need to store revenue, net income, etc. for each year since 2000 for each company in the database. The only way I can think of to make a table like this would be to make a column named 'revenue' and store the information in an array. But that way seems really messy; is there a better alternative (besides hard-coding '2011 Revenue,' '2010 Revenue,' etc.)?


#2 OP generalt

generalt

    Neowinian

  • Joined: 09-May 07

Posted 29 December 2012 - 14:28

Er, how about this: I plan on building a gigantic table where '2011 Revenue', '2010 Revenue', ... '2011 Net Income', etc. will be stored on one row for each company. Can anyone think of a better way to do this? (Would storing all of the information in arrays be better?)

#3 Kami-

Kami-

    ♫ d(-_-)b ♫

  • Tech Issues Solved: 2
  • Joined: 28-July 08
  • Location: SandBox

Posted 29 December 2012 - 16:12

Eh?

#4 James Rose

James Rose

    Software Developer

  • Tech Issues Solved: 1
  • Joined: 20-January 04
  • Location: New York City

Posted 29 December 2012 - 16:18

You're not giving us much information on the details, so it will be difficult to give you a schema. Basically, you will want to look into http://en.wikipedia....e_normalization. Also, I would HIGHLY recommend encrypting some or all of this data.

Good luck.

#5 DomZ

DomZ

    Neowinian Senior

  • Joined: 12-October 04
  • Location: Wales, UK
  • OS: OSX
  • Phone: Lumia 1020 / Nexus 4 / iPhone 4

Posted 29 December 2012 - 16:57

It also depends on how much information you have.

If you were only going to have one record for revenue for each year, you'd want something like

yearkey - revenue

which you could then query to get a revenue for a year (or multiple years)

having columns:

revenue_2011 revenue_2012 etc

is almost always a bad idea, although it depends again on what the makeup of the data is. If you have millions of companies' revenues, and you are more interested in querying (or comparing) revenues from given years across companies, this might be a better structure for you, but I doubt it even in that scenario.
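To make the "yearkey - revenue" idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the table name, column names, company and figures are all made up for illustration and are not from the thread.

```python
import sqlite3

# In-memory SQLite database, used purely for illustration (the thread targets MySQL).
conn = sqlite3.connect(":memory:")

# Hypothetical table: one row per company per year, instead of one column per year.
conn.execute(
    """
    CREATE TABLE revenue (
        company  TEXT    NOT NULL,
        yearkey  INTEGER NOT NULL,
        revenue  REAL    NOT NULL,
        PRIMARY KEY (company, yearkey)
    )
    """
)

# Made-up figures. Adding a new year is just another row, with no schema change.
rows = [
    ("Acme", 2010, 100.0),
    ("Acme", 2011, 120.0),
    ("Acme", 2012, 150.0),
]
conn.executemany("INSERT INTO revenue VALUES (?, ?, ?)", rows)


def revenue_for_years(conn, company, first_year, last_year):
    """Return [(year, revenue), ...] for one company over an inclusive year range."""
    cur = conn.execute(
        "SELECT yearkey, revenue FROM revenue "
        "WHERE company = ? AND yearkey BETWEEN ? AND ? ORDER BY yearkey",
        (company, first_year, last_year),
    )
    return cur.fetchall()


print(revenue_for_years(conn, "Acme", 2011, 2012))
```

The same query handles one year or many; with revenue_2011, revenue_2012 columns, each new year would need an ALTER TABLE and a change to every query that touches the table.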

#6 James Rose

James Rose

    Software Developer

  • Tech Issues Solved: 1
  • Joined: 20-January 04
  • Location: New York City

Posted 29 December 2012 - 18:05

having columns:

revenue_2011 revenue_2012 etc

is almost always a bad idea,


THIS!!!!! Do not have columns for each date, month or year. Please PLEASE do not do this. (WOW, I have had to go into so many jobs that use this sort of model!) If there are to be multiple dates, then a separate table is needed.

#7 OP generalt

generalt

    Neowinian

  • Joined: 09-May 07

Posted 10 January 2013 - 23:54

You're not giving us much information on the details, so it will be difficult to give you a schema.


Go here and ctrl + f "ITEM 8. FINANCIAL STATEMENTS AND SUPPLEMENTARY DATA"
I'm basically going to take the information from the financial statements and put it in a database. I'm going to do it for a few hundred companies and will make the database searchable based on the financials (e.g., average free cash flow margin > 10%).
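A rough sketch of what such a screen could look like, assuming the figures are stored one row per company, year and line item. Everything here is hypothetical: the `figures` table, the item names, the companies and the numbers are invented, "average free cash flow margin" is read as the mean of each year's free cash flow divided by revenue, and SQLite (via Python's sqlite3) stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: one row per company, year and line item.
conn.execute(
    """
    CREATE TABLE figures (
        company TEXT    NOT NULL,
        year    INTEGER NOT NULL,
        item    TEXT    NOT NULL,   -- e.g. 'revenue', 'free_cash_flow'
        amount  REAL    NOT NULL,
        PRIMARY KEY (company, year, item)
    )
    """
)

# Made-up data for two companies over two years.
conn.executemany(
    "INSERT INTO figures VALUES (?, ?, ?, ?)",
    [
        ("Acme", 2011, "revenue", 100.0), ("Acme", 2011, "free_cash_flow", 15.0),
        ("Acme", 2012, "revenue", 200.0), ("Acme", 2012, "free_cash_flow", 30.0),
        ("Bolt", 2011, "revenue", 100.0), ("Bolt", 2011, "free_cash_flow", 5.0),
        ("Bolt", 2012, "revenue", 100.0), ("Bolt", 2012, "free_cash_flow", 8.0),
    ],
)


def screen_by_avg_fcf_margin(conn, min_margin):
    """Return companies whose mean yearly FCF / revenue exceeds min_margin."""
    cur = conn.execute(
        """
        SELECT r.company
        FROM figures AS r
        JOIN figures AS f
          ON f.company = r.company AND f.year = r.year
        WHERE r.item = 'revenue' AND f.item = 'free_cash_flow'
        GROUP BY r.company
        HAVING AVG(f.amount / r.amount) > ?
        ORDER BY r.company
        """,
        (min_margin,),
    )
    return [row[0] for row in cur.fetchall()]


print(screen_by_avg_fcf_margin(conn, 0.10))
```

The self-join pairs each year's revenue row with the same year's free cash flow row, so the ratio is computed per year before it is averaged.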

THIS!!!!! Do not have columns for each date, month or year. Please PLEASE do not do this. (WOW, I have had to go into so many jobs that use this sort of model!) If there are to be multiple dates, then a separate table is needed.


This is what I was trying to avoid. How do I get around it?

#8 NateB1

NateB1

    Neowinian Senior

  • Joined: 09-January 07

Posted 11 January 2013 - 00:20

It depends on how robust and maintainable you want your database to be. Just looking at the website you linked to, I've come up with the following model, assuming you want data on a yearly basis, written in pseudo-SQL:

-- The table below contains a row for each line item in the statement: "Operating expenses", "Net Income", etc.
CREATE TABLE LineItems (
    LineItemID   INT AUTO_INCREMENT PRIMARY KEY,
    LineItemName VARCHAR(100) NOT NULL,
    IsExpense    BOOLEAN NOT NULL  -- lets you easily filter between expenses and income: FALSE is income, TRUE is expense. You could also flip this.
);

-- This stores info about the company.
CREATE TABLE Companies (
    CompanyID   INT AUTO_INCREMENT PRIMARY KEY,
    CompanyName VARCHAR(50) NOT NULL
);

-- This is your primary table, the table where you will store the financial info. It links to the above tables.
CREATE TABLE Finances (
    FinanceID  INT AUTO_INCREMENT PRIMARY KEY,
    CompanyID  INT NOT NULL,
    `Year`     SMALLINT NOT NULL,       -- assuming you're aggregating by year
    LineItemID INT NOT NULL,
    Amount     DECIMAL(18,2) NOT NULL,  -- MySQL has no currency type; DECIMAL keeps exact values
    -- you should only have one line per line item per company per unit of time
    UNIQUE (CompanyID, `Year`, LineItemID),
    FOREIGN KEY (CompanyID) REFERENCES Companies (CompanyID),
    FOREIGN KEY (LineItemID) REFERENCES LineItems (LineItemID)
);

This way, the database will remain compact, as you won't be storing a bunch of characters for every line item in the financial table, and it will be maintainable. I'd recommend indexing all foreign keys (the IDs that refer to other tables) once your database grows beyond a few thousand rows, because joins and lookups on unindexed columns will start to slow down. You can construct a view to join the tables together into one giant table if you want.
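Here is a minimal, runnable sketch of the three tables with the suggested foreign-key indexes and a joining view. It uses Python's sqlite3 module in place of MySQL, so the column types are SQLite's; the index names, the view name `FinancesFlat`, the unique constraint and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript(
    """
    CREATE TABLE LineItems (
        LineItemID   INTEGER PRIMARY KEY,
        LineItemName TEXT    NOT NULL,
        IsExpense    INTEGER NOT NULL          -- 0 = income, 1 = expense
    );
    CREATE TABLE Companies (
        CompanyID   INTEGER PRIMARY KEY,
        CompanyName TEXT NOT NULL
    );
    CREATE TABLE Finances (
        FinanceID  INTEGER PRIMARY KEY,
        CompanyID  INTEGER NOT NULL REFERENCES Companies(CompanyID),
        Year       INTEGER NOT NULL,
        LineItemID INTEGER NOT NULL REFERENCES LineItems(LineItemID),
        Amount     NUMERIC NOT NULL,
        UNIQUE (CompanyID, Year, LineItemID)   -- one row per company, year and line item
    );
    -- Indexes on the foreign keys, as recommended above (names are made up).
    CREATE INDEX idx_finances_company  ON Finances (CompanyID);
    CREATE INDEX idx_finances_lineitem ON Finances (LineItemID);
    -- View that joins everything back into one wide, readable table.
    CREATE VIEW FinancesFlat AS
        SELECT c.CompanyName, f.Year, li.LineItemName, li.IsExpense, f.Amount
        FROM Finances AS f
        JOIN Companies AS c  ON c.CompanyID   = f.CompanyID
        JOIN LineItems AS li ON li.LineItemID = f.LineItemID;
    """
)

# Made-up sample data.
conn.execute("INSERT INTO Companies (CompanyName) VALUES ('Acme')")
conn.executemany(
    "INSERT INTO LineItems (LineItemName, IsExpense) VALUES (?, ?)",
    [("Revenue", 0), ("Operating expenses", 1)],
)
conn.executemany(
    "INSERT INTO Finances (CompanyID, Year, LineItemID, Amount) VALUES (?, ?, ?, ?)",
    [(1, 2011, 1, 120), (1, 2011, 2, 80), (1, 2012, 1, 150)],
)

flat = conn.execute(
    "SELECT Year, LineItemName, Amount FROM FinancesFlat "
    "WHERE CompanyName = 'Acme' ORDER BY Year, LineItemName"
).fetchall()
print(flat)
```

The unique constraint makes the database itself reject a second "Revenue 2011" row for the same company, which is the "one line per line item per company per unit of time" rule from the schema comment.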

This model will also have the benefit of being able to extend to an infinite number of years, should you wish to keep doing this. I'm used to MS SQL, but check to see if MySQL has a PIVOT function where you can pivot the "Year" rows into columns - that way you can create a view or return results to your client(s) identical to how it appears on the site.
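On the pivot question: MySQL has no PIVOT operator that I know of, so a common substitute is conditional aggregation, i.e. one SUM(CASE WHEN ...) per year. Below is a small sketch of that technique, run against SQLite through Python's sqlite3 module. The flat table `finances_flat`, its columns, the helper `pivot_by_year` and the figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical flattened table (it could equally be a view over the normalized tables).
conn.execute(
    """
    CREATE TABLE finances_flat (
        company TEXT    NOT NULL,
        year    INTEGER NOT NULL,
        item    TEXT    NOT NULL,
        amount  REAL    NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO finances_flat VALUES (?, ?, ?, ?)",
    [
        ("Acme", 2011, "revenue", 120.0),
        ("Acme", 2012, "revenue", 150.0),
        ("Acme", 2011, "net_income", 20.0),
        ("Acme", 2012, "net_income", 30.0),
    ],
)


def pivot_by_year(conn, company, years):
    """Return one row per line item with one column per requested year.

    Each year becomes SUM(CASE WHEN year = <y> THEN amount END), so a year with
    no data for an item comes back as None (SQL NULL).
    """
    # Years are cast to int before being placed in the SQL text,
    # so only integers ever reach the query string.
    cols = ", ".join(
        f"SUM(CASE WHEN year = {int(y)} THEN amount END) AS y{int(y)}" for y in years
    )
    sql = (
        f"SELECT item, {cols} FROM finances_flat "
        "WHERE company = ? GROUP BY item ORDER BY item"
    )
    return conn.execute(sql, (company,)).fetchall()


print(pivot_by_year(conn, "Acme", [2011, 2012]))
```

The list of years is built by the caller, so the stored data stays in the long, one-row-per-year form and only the report is widened.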

#9 OP generalt

generalt

    Neowinian

  • Joined: 09-May 07

Posted 11 January 2013 - 17:03



Ah, great! Thanks, this is helpful.