Alex Holmes @grep_alex
Software engineer, author (Hadoop in Practice - http://t.co/RmStoT4oQS) and blogger (http://t.co/5VOKHW1tps). grepalex.com Joined April 2007
Tweets 400
Followers 541
Following 167
Likes 19
To commemorate #juneteenth, my org @🍎 is taking a day off to read on racial justice and reflect on how we can make this society a fair place for all. I’ll keep reading “So you want to talk about race” and start “White Fragility”. What are you, fellow white techie, going to do?
Fighting big data problems at scale? Attend my @OracleCodeOne session: oracle.rainfocus.com/widget/oracle/… #ShowUsYourCode #CodeOne
Shaded Hadoop client artifacts for the win coming with Hadoop 3! issues.apache.org/jira/browse/HA…
been meaning to write this for a while: configuring memory for MapReduce under YARN grepalex.com/2016/12/07/map…
wow, if you’re writing Scala using IntelliJ you should switch to CE - significantly faster than UE
Kafka now has time-based indexes to allow seeking to a particular point in time cwiki.apache.org/confluence/dis…
at #icml and #icml2016 and looking for a challenging and rewarding data science gig? check out jobs.apple.com/us/search?#&ss…
Twitter's Storm rewrite Heron is now open source
bumped into this with a large Spark job issues.apache.org/jira/browse/SP… fix was upping driver memory, shuffling with 2001 partitions
a must-read talk by @martinkl on stream processing patterns martin.kleppmann.com/2015/10/30/str…
some solid Spark config tips here slideshare.net/secret/IM4HsXA… - thanks @Evanfchan!
I'm hiring - send me a DM if you're an engineer interested in working on tough Internet-scale problems
big thanks to everyone that attended my @JavaOneConf talk today - here are the slides slideshare.net/grepalex/avoid…
Announcing Trumpet — Scalable INotify for HDFS github.com/verisign/trump… #Hadoop #HDFS #OSS /cc @VERISIGN
Slides are up: 'Apache Kafka Reliability Guarantees StrataHadoop NYC 2015' slideshare.net/jhols1/apache-… CC @gwenshap
this could be very interesting: incubating Apache Geode in-memory distributed database geode.incubator.apache.org
wow, super-cool … RT @brocknoland: Stop looking at code on Github and checkout CodeAtlas codatlas.com
and mapreduce.job.user.classpath.first=true in your Hadoop config if you want the same in your map/reduce container JVM's

Gwen (Chen) Shapira @gwenshap
28K Followers 10K Following Co-founder of @niledatabase. Making SaaS global, elastic and chill. Find me at: https://t.co/uyuHg400cp
Ananth Packkildurai @ananthdurai
3K Followers 2K Following Data @Zendesk, @SlackHQ | Author https://t.co/rvlBOXX0cy | Creator of https://t.co/XdMVrxUay6 | Angel Investor | Advisor for early stage data startups
Evelyn gordon @evelyngordon380
90 Followers 496 Following
Alberto Vallejo @avallejo_ai
910 Followers 2K Following Not just another Data Architect. BigData AI ML DWH lakehouse cloud DataMining NLP TextMining Analytics SQL search solr spark Azure AWS SnowflakeDB dbt ♘
Martin Lhotsky @JakinCz
53 Followers 2K Following
Jaikishan @jkhatwani
9 Followers 282 Following
Vinod Gupta @BeingVinodGupta
24 Followers 188 Following
ατⅿσδαⅿαη @atmosaman
5 Followers 150 Following
Josh Langner @joshlangner
323 Followers 811 Following I build workspaces on the Web. Previously NVIDIA, Mandiant, FireEye, Accenture Security, Verisign
kgpai @kgpai
107 Followers 436 Following
Mohamed Ahdab @mahdab1
7 Followers 43 Following
DataEngConf @dataengconf
1K Followers 4K Following We are now Data Council - This account is inactive - Please follow @datacouncilai | https://t.co/H26FTLfUmz
Kat Fernandez @katfernandez5
61 Followers 1K Following
Ryan Simons 🇯🇲 ... @ryansimonsjm
274 Followers 3K Following CEO & Founder DenArthur Analytics https://t.co/UPoENj17xs Lover of all things Data and Tech 💻📊 Motorsports fanatic 🏍️ Amateur cyclist 🚲 Lifelong Student
Jörn @joernkottmann
38 Followers 122 Following
Binh Nguyen Thanh @binhunix
24 Followers 317 Following
ashley khanom @ashleykhanom
14 Followers 309 Following
ming_tian @040840219
4 Followers 98 Following
jack shani @jackshani2u
36 Followers 85 Following
Naomi Colnaghi @naomi_colnaghi
61 Followers 113 Following Hi, I´m Italian and live in Germany since 2015. I help IT consultants in finding new exciting (inter)national projects and to upgrade their SAP and SFDC skills!
Zhuoteng Huang @huangzht
32 Followers 609 Following
Click IT @clickitnz
1 Followers 25 Following
jueqingsizhe66 @jueqingsizhe66
8 Followers 318 Following
Manfred Weber @manfred_weber_
50 Followers 359 Following System Architect | Big Data. Fast Data. Event Processing. Data Science. Machine Learning. AI.
Chowkidar gvrreddy @gvrreddy1
20 Followers 24 Following
Clevered @BeClevered
1K Followers 2K Following Lifelong learning in emerging technologies like Data Science, Machine Learning and AI with mentorship and career design
baselogic @baselogic
3K Followers 3K Following Author "Spring Security 3rd Edition (Packt)", Java, JavaEE, Spring, Spring Security, Tomcat, MongoDB, Maven, Gradle Architect, Designer, Instructor, Evangelist
saurabh pandit @panditsaurabh
63 Followers 400 Following
Maya Nair @mayasn09
496 Followers 677 Following Research interest: Predictive Analytics, Transfer Learning, Data science for good, Time series analysis, Traffic
Ranaivoson @air_manitra
517 Followers 4K Following
Ganesh Raju @ganeshraju
404 Followers 3K Following Tech Lead, Big Data and Datascience @Linaro #BigTop, #BigData, #Datascience, #ODPi, #MachineLearning, #AI, #Cloud, #Edge, #IoT, #ARM, #AArch64 #capoeirista #ev
abhiram @abhirj87
89 Followers 922 Following
DataOps Summit @DataOpsSummit
397 Followers 530 Following DataOps Summit 2017 | Boston, MA USA | November 2017 | https://t.co/BZBn6eAk5o
Amin Abbaspour @aminize
421 Followers 558 Following craftsman @auth0, read books, lift weights, help others (opinions mine)
Java Day Istanbul @javadayistanbul
3K Followers 2K Following Developer Conference 18 April, 2026. Powered by @jug_istanbul
Jowanza Joseph @Jowanza
3K Followers 2K Following Founder/CEO @complyparakeet. The General Ledger for Industrial Risk.
Mario G. @mariopiogioiosa
238 Followers 642 Following Software engineer. @TicinoSWCraft co-founder. Runner. Drummer. Dreamer. Nothing great in the world has ever been accomplished without passion (Hegel).
Dhananjay Gurav @DAGurav
18 Followers 286 Following
Eric Sammer @esammer
12K Followers 686 Following ceo at @decodableco! prev: @splunk, @rocanainc (acq'd), @cloudera. open source / dist systems / data. o'reilly author. [email protected]
Gwen (Chen) Shapira @gwenshap
28K Followers 10K Following Co-founder of @niledatabase. Making SaaS global, elastic and chill. Find me at: https://t.co/uyuHg400cp
Dmitriy Ryaboy ... @squarecog
9K Followers 1K Following VP Eng. Recovering data engineer. Co-author of @MissingReadme. Plays with swords. Helped build this stupid place, long ago.
Joey (veri-fied) @fwiffo
1K Followers 601 Following Software Engineer at Netflix. I like to solve problems. Board games are the best. Burma-shave. Pronouns: he/him
Natty @nattyice
1K Followers 364 Following I like data of all shapes and sizes. Solutioning at @dbt_labs. The less impressive half of the couple.
Julien Le Dem @J_
4K Followers 2K Following Architect, Founder, Angel, Advisor, OSS: @OpenLineage @MarquezProject, ASF: Parquet Arrow Iceberg 🐖. 🦋 https://t.co/4VQUXaZ5vu . he/him
Pete Skomoroch @peteskomoroch
51K Followers 8K Following Investor and AI startup founder. Focus: AI, LLMs, LifeOps, AI Product Management. Was founder @SkipFlag. EIR @Accel. Data Science & ML @LinkedIn, @AOL & @MIT
Michelle Wolf @michelleisawolf
417K Followers 389 Following pre-order Nice Lady album: https://t.co/jJ8mDBSE7v
Anna Smith @asmitholmes
65 Followers 125 Following Mother, wife, phlebotomist and manager. Passionate about listening to people and trying to make things better! All views my own, RTs not endorsements
San José Strong @SanJoseStrong1
570 Followers 82 Following we’re active on IG vs. here! we’re a volunteer-ran org that connects residents to resources in South Bay & creates in-house initiatives that build community🌻
holden karau @holdenkarau
16K Followers 2K Following she/her, OSS Big Data. ❤️🛵 ☕️ spark. I don't represent my employer. Live @ https://t.co/uOyeZtBXx0 , https://t.co/GB3Ok0vbVA
Rachel Cleetus @RachelCleetus
448 Followers 6 Following Policy director, Climate & Energy Program @ucsusa. Pursuing ambitious, just and equitable solutions to address climate change. Views are my own.
Mark Grover @mark_grover
1K Followers 594 Following Founder at Stemma, co-creator of Amundsen, Author.
Ritwik @slooowmotion
33 Followers 15 Following
Martin Kleppmann @martinkl
49K Followers 949 Following Find me at @martin.kleppmann.com on Bluesky, @[email protected] on Mastodon. Author of @intensivedata, Associate Professor @Cambridge_CL. he/him
Vijay Parthasarathy @vijay2win
801 Followers 231 Following Head of AI at Zoom, was at Facebook, Apple, Netflix, WebEx, Apache Cassandra Committer
Tyler Akidau @takidau
3K Followers 167 Following CTO @redpandadata. @streamingbook author. Ex-Snowflake, Xoogler. Owner of opinions. Something about whiskey. Punk, hardcore, black metal, smooth jazz. He/Him.
Jameel Syed @tilapia
644 Followers 2K Following
Sandy Ryza @s_ryz
2K Followers 594 Following Lead the @dagster project. Wrote Advanced Analytics with Spark.
Ben Bernanke @benbernanke
108K Followers 2 Following Author of The Courage to Act, now available in paperback: https://t.co/MAa3VQsC47. Former Fed Chair; Distinguished Fellow in Residence, @BrookingsInst.
Apache Kafka @apachekafka
68K Followers 236 Following A distributed streaming platform. Account managed by the Kafka PMC.
Irena Shaigorodsky @ishaigorodsky
30 Followers 24 Following
Community Over Code @ApacheCon
13K Followers 2K Following The events of the projects of The ASF. https://t.co/DwEDKpK1Ko #CommunityOverCode
Rocana @rocanainc
1K Followers 1K Following Splunk will leverage Rocana’s tech & team to advance its market-leading #machinedata platform & #machinelearning capabilities. Follow @splunk moving forward.
JP @johanzoidberg
236 Followers 1K Following Disappointed 'palindrome' isn't one. Also a group of squid isn't a squad.
Apache Drill @ApacheDrill
5K Followers 94 Following Open source MPP query engine inspired by Google’s Dremel. Ad hoc SQL interactive queries and data exploration on massive scale data. 1.20 download available!
Apache Impala @ApacheImpala
3K Followers 17 Following The Modern, Open Source MPP SQL Query Engine for Apache Hadoop, Apache HBase, Apache Kudu, Cloud Object Stores and More.
Apache Parquet @ApacheParquet
9K Followers 26 Following Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. It provides high performance compression
Suneet Nandwani @suneetn
189 Followers 345 Following Parent, Option trader, Builds Clouds, Golfer, Opinionated, ex Meta, ex eBay, ex Symantec
Swapnil Salunkhe @swapnils10
115 Followers 393 Following Big Data Enthusiast..!!! LinkedIn Page: http://t.co/lfxu3IDYvg
Mike Percy @mike_percy
736 Followers 717 Following Core Data Datastores team at @Meta. Formerly, @Cloudera and @Yahoo. Apache Kudu and Apache Flume PMC member. I don't speak for my employer.
Adam Kawa @adam_kawa
548 Followers 221 Following Co-founder and CEO at @GetInData (https://t.co/VFdoyuhUWB). Experienced and passionate Big Data consultant (previously at Spotify).
MMO @barkbay
101 Followers 641 Following Java/Scala/Spark developer - Traveler - Elephant Tamer - Container Carrier - Linuxian since 1996
William Vambenepe @vambenepe
8K Followers 2K Following Back in the Cloud/analytics space after a 5-year parenthesis working on consumer products (Google Search/News). Now at AWS (not speaking for my employer).
enissoz @enissoz
678 Followers 263 Following Apache HBase and Hadoop committer. HBase dev at Hortonworks.
Jeremy Karn @jeremykarn
135 Followers 9 Following
Edgar Meij @edgarmeij
2K Followers 998 Following Head of AI Platforms in Bloomberg AI Engineering, including data, machine learning, search, and LLM platforms used across the company. Views are mine.
Christian Tzolov ... @christzolov
1K Followers 426 Following R&D Software Engineer at @Broadcom's @SpringCentral team | #SpringAI lead | MCP Java SDK founder | @TheASF Committer
Hemanth Yamijala @yhemanth
392 Followers 185 Following Soroco. Ex-Cloudera. Data and distributed systems. Figuring out product and scaling through leadership. Carnatic music. Painting.