Friday, December 7, 2007

The Myth of Dynamic Typing?

My last post also got me thinking about a different idea (dynamic programming paradigms), and I have begun to think about this topic as programming languages versus scripting languages. I personally prefer programming languages, but I can't deny the power of scripting languages. I've often heard that the power of scripting languages comes from dynamic typing and the ability to give a variable of any type to a function and let it do its thing. But I think this is flawed logic, because at some point the type becomes fixed and the appropriate algorithm must be used to handle this data. So dynamic typing is just a convenience for the programmer that abstracts away a piece of the truth. And now you have to ask yourself, "Why couldn't this same mechanism be done with templates, functional programming, and other mechanisms in a programming language?" The correct answer is that there's no reason it couldn't be.

I think that the problem is not that the programming languages are compiled, statically typed, or any of the other nonsense that you always hear; the problem is simply what we expect from them. We expect a language like C/C++ to be low-level, operating system agnostic, and portable without any of the "mess" of GUIs, rendering capabilities, etc. But we expect a language like MATLAB/Python to be high-level, operating system agnostic, and portable with all of the "power" of GUIs, rendering capabilities, etc. Each of these paradigms definitely has its own pros/cons, but I just think that the divide that exists between the two doesn't necessarily have to be as big as it is right now.

This also brings up the real topic that I am actually interested in, "Why can't databases (and their results) be statically-typed?" Why can't I do something like this:

struct Transaction
{
    unsigned int id; //Primary Key
    Date date;
    DecimalValue amount;
    std::string description;
};

struct BudgetCategory
{
    unsigned int id; //Primary Key
    std::string name;
    DecimalValue budgeted_spending;
    PaymentType frequency;
};

struct Transaction_BudgetCategory
{
    unsigned int transaction_id; //Foreign Key(Transaction.id)
    unsigned int budgetcategory_id; //Foreign Key(BudgetCategory.id)
};

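auto_type category_sums =
    SELECT BudgetCategory.name, sum(Transaction.amount)
    FROM Transaction_BudgetCategory, Transaction, BudgetCategory
    WHERE Transaction_BudgetCategory.transaction_id == Transaction.id
      AND Transaction_BudgetCategory.budgetcategory_id == BudgetCategory.id
    GROUP BY BudgetCategory.name;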

Obviously I'm talking about a whole new language/paradigm, but why couldn't you do something like that? The compiler could figure out the appropriate types and create the appropriate structure for you. I know that this is kind of what LINQ is trying to do, but my understanding is that with LINQ you have to create your own structures to store the results in, and why can't the compiler just do it for you?
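For comparison, here is a sketch of what has to be written by hand in plain C++ to get the same per-category sums. It is only an illustration under some assumptions: the row structs are simplified stand-ins for the ones above (`double` replaces `DecimalValue`, and the date, description, and frequency fields are left out), and the function name `category_sums` simply mirrors the variable in the imagined query.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Simplified stand-ins for the structs above (assumptions for this sketch).
struct TransactionRow { unsigned int id; double amount; };
struct CategoryRow    { unsigned int id; std::string name; };
struct LinkRow        { unsigned int transaction_id; unsigned int budgetcategory_id; };

// Hand-written join and GROUP BY. The result type (category name -> summed
// amount) is spelled out manually; this is the part the imagined compiler
// would derive from the query.
std::map<std::string, double> category_sums(
    const std::vector<TransactionRow>& transactions,
    const std::vector<CategoryRow>& categories,
    const std::vector<LinkRow>& links)
{
    // Index both tables by primary key so each link can be resolved.
    std::map<unsigned int, double> amount_by_id;
    for (std::size_t i = 0; i < transactions.size(); ++i)
        amount_by_id[transactions[i].id] = transactions[i].amount;

    std::map<unsigned int, std::string> name_by_id;
    for (std::size_t i = 0; i < categories.size(); ++i)
        name_by_id[categories[i].id] = categories[i].name;

    std::map<std::string, double> sums;
    for (std::size_t i = 0; i < links.size(); ++i)
    {
        std::map<unsigned int, double>::const_iterator amount =
            amount_by_id.find(links[i].transaction_id);
        std::map<unsigned int, std::string>::const_iterator name =
            name_by_id.find(links[i].budgetcategory_id);

        // Inner-join behaviour: links that point at missing rows are skipped.
        if (amount != amount_by_id.end() && name != name_by_id.end())
            sums[name->second] += amount->second;
    }
    return sums;
}
```

Every type in that function is something the query already implies, which is the argument for letting the compiler generate it.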

Maybe I'm just dreaming or not seeing the drawbacks to something like this, but I still think it'd be cool.

Super Global Variables

I've been working on a program to manage my budget for the last few months, and when I first started out I had it store all of the data in XML files which were loaded into well-organized classes. It was pretty simple to get up and running and was working really well until I started to want to "link" some of the values together (transactions to budget categories) and do some statistical analysis. I definitely could have added support for storing these links and doing these types of "queries" in my classes, but it just seemed like reinventing the wheel because databases could do exactly what I wanted and more.

So I decided to start playing around with using a database. SQLite was amazingly simple to get up and running, and despite a few complaints (no foreign key support and limited ALTER TABLE support), it got the job done with less hassle than I expected. Fortunately both complaints have solutions: foreign keys can be enforced through triggers, and there is a workaround for removing columns. I definitely had to brush up on my SQL since I hadn't written a line of it in years, but for the most part it's a pretty simple language and I was linking and querying the data in no time.

But this whole thing got me wondering, "this ease of use has to have an explanation and most likely an associated drawback or cost". After thinking about this for a while, I came to the conclusion that databases are basically just "Super Global Variables". They're "global" because they're a nebulous set of data that's accessible from any part of a program, and they're "super" because you can grab them in various forms/combinations. When I first realized this, I kind of chuckled to myself, because programmers always talk about how global variables (just a superset of static variables) are evil/problematic, but they'll use a database without even thinking about it. It just made me realize how sometimes our own abstractions actually hide the ugly truth from us, a truth that we'd probably rather ignore.

Wednesday, December 5, 2007

Memory Leak Detection with Visual Studio 2008

Brad Fish told me that Visual Studio 2008 was definitely worth the upgrade (especially on Vista), so I took the plunge. For a C++ programmer, I don't think it was as cool of a jump as 2003 to 2005 and not even close to the HUGE jump from 6 to 2003 (I skipped 2002), but it's still been a very nice upgrade.

But anyway, I have been working on making my own little SQLite3 C++ wrapper, because I just didn't like the ones that I could find out there. While doing that I also decided that it would be easiest to use a smart pointer to manage all of the pointers to SQLite stuff. I had played around with the Boost shared_ptr, but once again I just didn't like some of the syntax (mostly the custom deleter being in the constructor rather than a template parameter), so I just dusted off some old shared pointer code of my own and added a custom deleter to it.
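To make the syntax point concrete, here is a minimal sketch of a reference-counted pointer whose deleter is a template parameter instead of a constructor argument. It is not the code from this post. `SharedHandle`, `DeleteWithDelete`, and `CountingIntDeleter` are hypothetical names, and the sketch is not thread-safe.

```cpp
#include <cstddef>

// Default deleter policy: plain `delete`.
template <typename T>
struct DeleteWithDelete
{
    static void destroy(T* p) { delete p; }
};

// Example custom policy (hypothetical): counts how often it runs so the
// behaviour can be observed. A SQLite policy would call the library's
// own cleanup function here instead.
struct CountingIntDeleter
{
    static int& calls() { static int n = 0; return n; }
    static void destroy(int* p) { ++calls(); delete p; }
};

// Minimal reference-counted pointer. The deleter is part of the type.
template <typename T, typename Deleter = DeleteWithDelete<T> >
class SharedHandle
{
public:
    explicit SharedHandle(T* p = 0) : ptr_(p), count_(0)
    {
        if (p)
        {
            // If allocating the counter fails, still clean up the resource.
            try { count_ = new std::size_t(1); }
            catch (...) { Deleter::destroy(p); throw; }
        }
    }

    SharedHandle(const SharedHandle& other)
        : ptr_(other.ptr_), count_(other.count_)
    {
        if (count_) ++*count_;
    }

    // Copy-and-swap: `other` is a copy, so swapping releases the old value
    // when `other` goes out of scope.
    SharedHandle& operator=(SharedHandle other)
    {
        swap(other);
        return *this;
    }

    ~SharedHandle()
    {
        // The last owner runs the deleter policy and frees the counter.
        if (count_ && --*count_ == 0)
        {
            Deleter::destroy(ptr_);
            delete count_;
        }
    }

    void swap(SharedHandle& other)
    {
        T* p = ptr_; ptr_ = other.ptr_; other.ptr_ = p;
        std::size_t* c = count_; count_ = other.count_; other.count_ = c;
    }

    T* get() const { return ptr_; }
    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }
    std::size_t use_count() const { return count_ ? *count_ : 0; }

private:
    T* ptr_;
    std::size_t* count_;
};
```

With this shape a declaration such as `SharedHandle<int, CountingIntDeleter> p(new int(7));` carries the cleanup rule in the type. For SQLite, a policy whose `destroy` calls `sqlite3_close` or `sqlite3_finalize` would take the place of the default. The trade-off is that handles with different deleters become different, incompatible types.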

It all seemed to be working just fine, but I just wanted to make sure that I was cleaning everything up properly and then I remembered that the current versions of Visual Studio don't dump out the memory leaks like Visual Studio 6 used to do by default. So I started some Googling and found this page. It says that it doesn't work with Express Edition, but it must be a typo or something, because it works (just not quite as described). Basically, here's what I was able to figure out from playing around with it. All you need to do is:
1) Add the header crtdbg.h
2) Call _CrtSetDbgFlag ( _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF );

You can find out more about _CrtSetDbgFlag() here, but calling it this way will turn on the printing of memory leaks when your program exits (only when _DEBUG is defined).
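Put together, the two steps might look like the sketch below. The wrapper function `enable_leak_report` is a made-up name, and the preprocessor guards are an addition so that the same file still compiles with other compilers and in release builds.

```cpp
#ifdef _MSC_VER
#include <crtdbg.h>   // Microsoft CRT debug-heap API
#endif

// Hypothetical helper: call once at the top of main().
// Returns true only when the leak report was actually switched on.
bool enable_leak_report()
{
#if defined(_MSC_VER) && defined(_DEBUG)
    // Track allocations and dump whatever is still allocated at exit.
    _CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);
    return true;
#else
    return false;  // other compilers or release builds: nothing to enable
#endif
}
```

With a debug build run under the debugger, a deliberate leak such as `new int[16]` with no matching `delete[]` should then show up in the leak dump at exit.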

After creating some artificial memory leaks, I noticed that it didn't dump the line that the memory was allocated on like it used to. So I did a little scavenging through the crtdbg.h header file and I found a conditional compile that would enable the printing of the allocation line. I added the two defines to my code:
#define _CRTDBG_MAP_ALLOC
#define _CRTDBG_MAP_ALLOC_NEW
But unfortunately, this is just a mechanism to allow old code to still compile, because now it just prints the line in the crtdbg.h file. This is obviously useless (something that, I just noticed, is also pointed out in the comments of the crtdbg.h file), so I guess this is the first thing I've found in the newer versions of Visual Studio that just isn't up to par with Visual Studio 6 (I know, I didn't think I'd ever say/hear/read that either).

I guess that I should just be happy that I was able to get my memory leak dump again, but I'm still wondering why that's not on by default.