The EDW team at Northwestern has developed a custom SQL Server Integration Services (SSIS) component that extracts structured data from text using configurable regular expressions. This extensible tool has already been used to implement many data marts that turn free-text documentation into discrete information.
This SSIS component has been open-sourced and is available on the CodePlex page for RegExtractor.
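
To illustrate the general idea of turning free text into discrete fields with configurable regular expressions, here is a minimal Python sketch. It is not RegExtractor's code or configuration format: the field names, the patterns, and the `extract_fields` helper are all hypothetical, chosen only to show how a set of named patterns can be applied to a note to produce one structured row.

```python
import re

# Hypothetical configuration: field name -> regex with one capture group.
# These patterns are illustrative only and are not taken from RegExtractor.
PATTERNS = {
    "ejection_fraction_pct": re.compile(
        r"ejection fraction[^0-9]{0,20}(\d{1,2})\s*%", re.IGNORECASE
    ),
    "systolic_bp": re.compile(
        r"\bBP[:\s]+(\d{2,3})\s*/\s*\d{2,3}", re.IGNORECASE
    ),
}


def extract_fields(text, patterns=PATTERNS):
    """Return a dict of field name -> first captured value (None if no match)."""
    row = {}
    for name, pattern in patterns.items():
        match = pattern.search(text)
        row[name] = match.group(1) if match else None
    return row


note = "Echo today. Ejection fraction is 55%. BP: 128/82, no complaints."
print(extract_fields(note))
```

Keeping the patterns in a configuration table rather than in the extraction logic means a new field can be added by supplying another expression, without changing the code that applies them.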