Loading BigQuery data into Neo4j
Matt Casters on Data Integration & Graphs
by admin
4y ago
Hi Kettle & Neo4j fans, One of my colleagues asked how to read from Google BigQuery using Kettle and then load the data into Neo4j. The BigQuery connection type is listed so it should be fairly straightforward, but it’s worth mentioning how to do it here… The first thing you need to do is download the JDBC driver, since it’s not included in your Kettle download. Go to the Simba JDBC driver download website and get it under the JDBC 4.2-compatible label. From the zip file, copy all the libraries (.jar files) to your Kettle distribution under the lib/ folder… except the commons-* and …
Kettle Community Meetup 2019
by admin
4y ago
Dear Kettle friends, Great news! We’re hosting a Kettle Community Meetup (#KCM19) in Antwerp this year. Please visit our wonderful landing page for more information: http://landing.know.bi/kettle-community-meeting-2019-0 [Image: the wonderful location of the 2019 Kettle Community Meetup] The meetup was previously named “Pentaho Community Meetup” (PCM). The re-branding as KCM was not done for any negative reasons towards Pentaho or Hitachi Vantara, but specifically because the last several PCM editions were overwhelmingly about Kettle anyway. Kettle as a community project …
True diversity
by admin
4y ago
When I asked on Twitter what “true diversity” means I obviously meant … beyond the obvious notion that we shouldn’t care about race, religion, ethnicity and so on. You see, when I first came in contact with open source it was pretty commonplace to be as rude as you could possibly be if someone didn’t use the right tone on a mailing list or, heaven forbid, made a mistake in the code. Unit testing wasn’t a thing back then, you see. It wasn’t just Linus either, in case you wonder, and for a while I was proud of the nickname they gave me (Soup Nazi after the Seinfeld episode with the same name …
Kettle Beam update 0.5.0
by admin
4y ago
Dear Kettle friends, As you may or may not know, integration with Apache Beam has been slowly but steadily moving along. Apache Beam in short: “Implement batch and streaming data processing jobs that run on any execution engine.” (quoting beam.apache.org) So for a tool like Kettle this is something very enticing. If we can make a Kettle transformation run on one of the execution engines, we can run it on all of them. This is what the Kettle Beam project is all about. For an overview of this, check out this presentation I did on the subject: http://beam.kettle.be And the video of …
Calculate unique values in parallel for Neo4j Node creation
by admin
4y ago
Hi Kettle and Neo4j fans! So maybe that title is a little bit over the top but it sums up what this transformation does. Here is what we do in this transformation: read a text file, calculate unique values over 2 columns, and create Neo4j nodes for each unique value. To do this we first normalize the columns, effectively doubling the number of rows in the set. Then we do some cleanup (remove double quotes). The secret sauce is then to do a partitioned unique value calculation (5 partitions means 5 parallel threads). By partitioning the data on the single column we guarantee that the same data ends u …
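The idea described above can be illustrated outside of Kettle with a small Python sketch. This is not the transformation from the post, only an approximation of its steps under stated assumptions: the function names (`normalize`, `clean`, `partition`, `unique_values`) are hypothetical, the input is assumed to be an in-memory list of two-column rows rather than a text file, CRC32 stands in for whatever partitioning rule Kettle applies, and the Neo4j node creation step is left out entirely.

```python
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32


def normalize(rows):
    """Turn every two-column row into two single-value rows (doubles the row count)."""
    for left, right in rows:
        yield left
        yield right


def clean(value):
    """Cleanup step: remove double quotes."""
    return value.replace('"', "")


def partition(values, partitions):
    """Route each value to a bucket chosen by a stable hash of the value itself,
    so equal values always land in the same bucket (assumed partitioning rule)."""
    buckets = [[] for _ in range(partitions)]
    for value in values:
        buckets[crc32(value.encode("utf-8")) % partitions].append(value)
    return buckets


def unique_values(rows, partitions=5):
    """Compute unique values per partition, one worker thread per partition."""
    cleaned = (clean(value) for value in normalize(rows))
    buckets = partition(cleaned, partitions)
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        per_partition = list(pool.map(set, buckets))
    # Equal values never span two partitions, so the per-partition results
    # can simply be concatenated without a second de-duplication pass.
    return [value for unique in per_partition for value in unique]


if __name__ == "__main__":
    sample = [('"a"', "b"), ("b", "c"), ("a", '"c"')]
    print(sorted(unique_values(sample)))
```

Because the partition is chosen from the value itself, no thread ever needs to know what another thread has seen, which is what makes the per-partition unique calculation safe to run in parallel.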
Catching up with Kettle REMIX
by admin
4y ago
Dear Kettle and Neo4j friends, Since I joined the Neo4j team in April I haven’t given you any updates, despite the fact that a lot of activity has been taking place in both the Neo4j and Kettle realms. First and foremost, you can grab the cool Neo4j plugins from neo4j.kettle.be (the plugin in the marketplace is always out of date since it takes weeks to update the metadata). Then, based on valuable feedback from community members, we’ve updated the DataSet plugin (including unit testing) to include relative paths for filenames (for easier git support), to avoid modifying transformation metadata a …
Farewell Pentaho
by admin
4y ago
Dear Kettle friends, 12 years ago I joined a wonderful team of people at Pentaho who thought they could make a real change in the world of business analytics. At that point I had recently open sourced my own data integration tool (then still called ‘ETL’) called Kettle, and so I joined in the role of Chief Architect of Data Integration. The title sounded great and the job included everything from writing articles (and a book), massive amounts of coding, testing, software releases, giving support, doing training, workshops, … In other words, life was simply doing everything I possibly and impossibly …
Pentaho Community Meetup 2016 recap
by admin
4y ago
Dear friends, I just came back from PCM16, the 9th annual edition of our European Pentaho Community Meetup. We had close to 200 subscriptions for this event, of which about 150 showed up, making this the biggest so far. Even though veterans of the conference like myself really appreciate the warmth in previous locations like Barcelona and Cascais, I have to admit we did get a great venue in Antwerp this year with 2 large rooms, great catering and top notch audiovisual support in a nice part of the city center. (Free high speed Antwerp city WiFi, Yeah!) Content-wise everything was more than OK wit …
Dynamic and Scalable : Pentaho 6.1 has arrived!
by admin
4y ago
Hello Kettle and Pentaho fans! Yes indeed, we’ve got another present for you in the form of a new Pentaho release: version 6.1. This predictable, steady flow of releases has in my opinion pushed the popularity of PDI/Kettle over the years, so it’s great that we manage to keep this up. The image above shows the evolution of PDI download counts over the years on SourceForge only. There’s actually a ton of really nice stuff to be found in version 6.1, so for a more complete recap I’m going to refer to my friend and PDI product manager Jens on his blog. However, there are a few favorite PDI topics I …
