
dalenewman / transformalize

156.0 156.0 33.0 173.81 MB

Configurable Extract, Transform, and Load

License: Other

C# 94.36% Batchfile 0.36% HTML 5.10% Dockerfile 0.08% Shell 0.10%
data-warehouse denormalize elasticsearch etl etl-framework excel files mysql postgresql solr sql-server sqlce sqlite ssas

transformalize's Introduction

Hi there 👋

transformalize's People

Contributors

dalenewman, dependabot[bot]


transformalize's Issues

Question about documentation

Hello, @dalenewman !
Transformalize looks great, but I ran into one problem: I can't find any information about its functionality. Are you going to publish more documentation? It would be great if you could publish detailed information about the API (for using Transformalize as a NuGet package), or at least about configuring it through XML.
Thanks a lot!

Some questions about features

Hi Dale. Your solution seems closest to my idea of what ETL should look like :)

I have a couple of questions:

  1. Are star-schema warehouses supported by Transformalize, where many sources load data into a single table?
  2. Is it possible to handle incremental updates keyed on multiple columns? Is it OK to specify the primary-key attribute on multiple columns?
  3. Is it possible to handle time-based incremental updates? For example, we may have a lot of historical data that we'd like to exclude from updates because we're sure it can't change. It would be great to have an option to add a 'date-of-origin' attribute and to specify that data before some date should not be updated.

Thank you.
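
To make questions 2 and 3 concrete, here is a minimal Python sketch of the behavior being asked for: an upsert keyed on several columns that ignores rows older than a cutoff date. It is not Transformalize code and says nothing about how Transformalize works internally. The function `incremental_upsert`, its parameters, and the in-memory `target` dictionary are all hypothetical names chosen for illustration.

```python
from datetime import date


def incremental_upsert(target, rows, key_fields, date_field, cutoff):
    """Upsert rows into `target`, a dict keyed by a composite key tuple.

    Rows whose `date_field` is earlier than `cutoff` are skipped, on the
    assumption that historical data cannot change. All names here are
    hypothetical; this is not Transformalize's API.
    """
    counts = {"inserted": 0, "updated": 0, "skipped": 0}
    for row in rows:
        # Time-based filter: leave rows from before the cutoff untouched.
        if row[date_field] < cutoff:
            counts["skipped"] += 1
            continue
        # Composite key: one tuple built from several columns.
        key = tuple(row[field] for field in key_fields)
        if key not in target:
            target[key] = dict(row)
            counts["inserted"] += 1
        elif target[key] != row:
            target[key] = dict(row)
            counts["updated"] += 1
    return counts
```

Calling it with `key_fields=["src", "id"]` and `cutoff=date(2021, 1, 1)` would treat `(src, id)` as the primary key and skip every row dated before 2021.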

Multiple unrelated entities

Is there a way to have multiple unrelated entities in the same config file? I have several tables that don't have relationships, and when I run the init command it gives this error: "You have 2 entities so you need 1 relationships. You have 0 relationships."
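
The error message suggests a validation rule along the lines of "n entities need n - 1 relationships". The Python sketch below only illustrates that inferred rule. It is an assumption generalized from the single message above, not the actual Transformalize check, and `relationship_error` is a hypothetical name.

```python
def relationship_error(entity_count, relationship_count):
    """Return an error string if there are too few relationships, else None.

    Assumes (from the quoted message only) that n entities need n - 1
    relationships. Hypothetical helper, not Transformalize's code.
    """
    needed = max(entity_count - 1, 0)
    if relationship_count < needed:
        return (
            f"You have {entity_count} entities so you need {needed} relationships. "
            f"You have {relationship_count} relationships."
        )
    return None
```

Under that assumption, two unrelated entities in one process would always fail this check, which matches the behavior reported here.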

mapping

How do I map an input table field to an output table field?

Logo contribution

Hello @dalenewman, I am a graphic designer and I like to contribute to open source software, so I designed a logo for Transformalize. I tried to represent the function of the application using a load, a conversion symbol, and the letter t. Any other ideas? What do you think? I will wait for your feedback.

Have a nice day.


Linux?

Is it possible to run this tool on Linux? Perhaps with .net core? I'm interested in gathering data from a local Linux MySQL server, doing some transformations and de-normalization, then pushing it to another Linux server back into MySQL. Thanks!

Getting data out from SQLServer table that has a different schema than dbo

I cannot figure out how to write the right config for getting data out of SQL Server when the table is in a schema other than dbo.

Here is my current config that I'm trying to use for getting the metadata created. I have no problems getting the metadata created when the table is created in the default "dbo" schema.

<transformalize>
<processes>
    <add name="MyData">
        <connections>
            <add name="input" 
            provider="sqlserver" 
            enabled="true" 
            schema="schemaname" 
            connection-string="Data Source=server; Initial Catalog=databasename; Trusted_Connection=True;" />                
            <!--<add name="output" provider="elasticsearch" server="localhost" port="9200" />-->
        </connections>
        <search-types>
            <add name="default" analyzer="keyword" />
        </search-types>
        <entities>
            <add name="EntityName" />
        </entities> 
        <relationships/>
    </add>
</processes>
</transformalize>

How to compile

Hi,
does this project compile? On VS2015 U2 I'm unable to get quite a few dependencies via NuGet as it's not able to find them online:

  • Cfg-NET (required 0.7.4 - found 0.6.7)
  • Cfg-NET.Environment (req. 0.0.2 - found 0.0.1)
  • Cfg-NET.Reader (0.1.1 vs 0.1.0)
  • Cfg-NET.Shorthand (0.0.6 vs 0.0.4)
  • Common.Logging (3.3.1 vs 3.3.0 / Pre 3.3.2-Alpha3)
  • Quartz (2.3.3 vs 2.3.2)
  • System.Data.SQLite.Core (1.0.101 vs not found at all)

Any suggestion / clarification?

Autofac not resolving IEntityDeleteHandler

When trying to set deletes to true for an entity, I get the following error:

Unhandled Exception: Autofac.Core.DependencyResolutionException: An exception was thrown while activating ?:Transformalize.Contracts.IProcessController. ---> Autofac.Core.Registration.ComponentNotRegisteredException: The requested service 'horizon-replicationvw_cds_rdw_mbr_acct_grp_sys (Transformalize.Contracts.IEntityDeleteHandler)' has not been registered. To avoid this exception, either register a component to provide the service, check for service registration using IsRegistered(), or use the ResolveOptional() method to resolve an optional dependency.
   at Autofac.ResolutionExtensions.ResolveService(IComponentContext context, Service service, IEnumerable`1 parameters)
   at Autofac.ResolutionExtensions.ResolveNamed[TService](IComponentContext context, String serviceName, IEnumerable`1 parameters)
   at Transformalize.Ioc.Autofac.Modules.ProcessControlModule.<Load>b__3_0(IComponentContext ctx)
   at Autofac.Builder.RegistrationBuilder.<>c__DisplayClass0_0`1.<ForDelegate>b__0(IComponentContext c, IEnumerable`1 p)
   at Autofac.Core.Activators.Delegate.DelegateActivator.ActivateInstance(IComponentContext context, IEnumerable`1 parameters)
   at Autofac.Core.Resolving.InstanceLookup.Activate(IEnumerable`1 parameters, Object& decoratorTarget)
   --- End of inner exception stack trace ---
   at Autofac.Core.Resolving.InstanceLookup.Activate(IEnumerable`1 parameters, Object& decoratorTarget)
   at Autofac.Core.Resolving.InstanceLookup.Execute()
   at Autofac.Core.Resolving.ResolveOperation.GetOrCreateInstance(ISharingLifetimeScope currentOperationScope, IComponentRegistration registration, IEnumerable`1 parameters)
   at Autofac.Core.Resolving.ResolveOperation.Execute(IComponentRegistration registration, IEnumerable`1 parameters)
   at Autofac.ResolutionExtensions.TryResolveService(IComponentContext context, Service service, IEnumerable`1 parameters, Object& instance)
   at Autofac.ResolutionExtensions.ResolveService(IComponentContext context, Service service, IEnumerable`1 parameters)
   at Autofac.ResolutionExtensions.Resolve[TService](IComponentContext context, IEnumerable`1 parameters)
   at Transformalize.Command.NowScheduler.Start()
   at Transformalize.Command.Program.Main(String[] args)

Can't find TFL.exe

Hi,

Great project, but where is TFL.exe located?
Also, building Pipeline.Command crashes the CSC due to IOException.

Thanks.

How to detect Header DataTypes during File Parsing

In my case, the source CSV data file and its header content and types change often, so they are basically unknown. While parsing, or before parsing, I first need to discover the data types of the data in the columns.

I just need to load and save the data into a SQL table based on whatever data and data types are in the CSV file. Can I load the CSV data and data types into the DB automatically with ETL box? I saw your expando example but was a little lost on how to achieve a high-level task.

For example, given the high-level algorithm below, how would I achieve this in your lib?

// Pseudocode: scan the first 9 rows to discover the data types in the CSV file's columns
var etlParsedFile = DaleNewMan.Trans.Open/parseCSV(someUnknownData.csv).ScanRows(9)
foreach (var col in etlParsedFile)
    new dataTable.addColumn(etlParsedFile.GetNextHeaderCol)
CreateNewSqlServerTable
LoadDataToSqlTable??

thanks
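
The algorithm described above can be sketched outside of Transformalize. The Python example below is a hedged illustration using only the standard library. It samples the first 9 data rows, guesses a column type for each header, creates a table, and loads every row. SQLite stands in for SQL Server, and `infer_type` and `load_csv` are hypothetical helpers, not part of Transformalize or ETLBox. It assumes every row has the same number of fields as the header.

```python
import csv
import io
import sqlite3
from datetime import datetime


def infer_type(values):
    """Guess a SQL column type that fits every non-empty sample value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "TEXT"
    # Try the narrowest type first and fall back to wider ones.
    candidates = (
        ("INTEGER", int),
        ("REAL", float),
        ("TIMESTAMP", datetime.fromisoformat),
    )
    for sql_type, parse in candidates:
        try:
            for value in non_empty:
                parse(value)
            return sql_type
        except ValueError:
            continue
    return "TEXT"


def load_csv(conn, table, text, sample_rows=9):
    """Create `table` from CSV text using types inferred from a row sample."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = list(reader)
    sample = rows[:sample_rows]
    types = [infer_type([row[i] for row in sample]) for i in range(len(header))]
    # Identifiers are quoted; they come from the file header, not from user SQL.
    columns = ", ".join(f'"{name}" {sql_type}' for name, sql_type in zip(header, types))
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return dict(zip(header, types))
```

A caller would pass an open connection, for example `load_csv(sqlite3.connect(":memory:"), "imported", csv_text)`. Because only a sample is inspected, a later row may not fit the inferred type, so a real loader would need a fallback for that case.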
