FLaNK Stack 26 February 2024

26-February-2024

FLaNK Stack Weekly

Tim Spann @PaaSDev

https://pebble.is/PaaSDev

https://vimeo.com/flankstack

https://www.youtube.com/@FLaNK-Stack

https://www.threads.net/@tspannhw

https://medium.com/@tspann/subscribe

https://www.cloudera.com/campaign/apache-nifi-for-dummies.html

https://ossinsight.io/analyze/tspannhw

CODE + COMMUNITY

Please join my meetup groups: NJ, NYC, Philly, and virtual.

http://www.meetup.com/futureofdata-princeton/

https://www.meetup.com/futureofdata-newyork/

https://www.meetup.com/futureofdata-philadelphia/

This is Issue #126

https://github.com/tspannhw/FLiPStackWeekly

https://www.cloudera.com/solutions/dim-developer.html

Articles

Using Google Gemma https://medium.com/@tspann/google-gemma-for-real-time-lightweight-open-llm-inference-88efe98e580f

NYC Traffic?? (NiFi, Kafka, Flink) https://medium.com/@tspann/nyc-traffic-are-you-kidding-me-6d3fa853903b

Subways and Transit Updates in Real-Time https://medium.com/@tspann/subways-and-transit-updates-in-real-time-30c104c359ef

Open Source Data Infrastructure Meetup - Feb 2024 https://medium.com/@tspann/open-source-data-infrastructure-meetup-feb-2024-9e8048666828

https://sap1ens.com/blog/2024/02/18/customizing-flink-class-shadowing/

https://www.projectpro.io/recipes/use-nifi-extract-and-parse-data-from-http-endpoints-and-store-data-persistent-storage

https://engineering.grab.com/attribution-platform

https://amistrongeryet.substack.com/p/why-are-llms-so-gullible

https://huggingface.co/blog/gemma

https://developer.nvidia.com/blog/build-an-llm-powered-data-agent-for-data-analysis/

https://thenewstack.io/the-rise-of-small-language-models/

https://www.infoq.com/news/2024/02/pinterest-pubsub-kafka-flink/

https://www.infoq.com/news/2024/01/doordash-service-mesh/

https://thenewstack.io/demo-use-webassembly-to-run-llms-on-your-own-device-with-wasmedge

https://www.eleuther.ai/releases

https://www.microsoft.com/en-us/research/blog/orca-2-teaching-small-language-models-how-to-reason/

https://www.baeldung.com/ops/docker-remove-dangling-unused-images

AI + more is required for startups https://www.nfx.com/post/ai-like-water

https://explainextended.com/2023/12/31/happy-new-year-15/

https://medium.com/sids-tech-cafe/event-driven-systems-lessons-from-the-trenches-107c07b3fc1d

https://materializedview.io/p/from-samza-to-flink-a-decade-of-stream

https://exaspark.medium.com/the-ultimate-guide-to-postgresql-data-change-tracking-c3fa88779572

https://www.wired.com/story/17-tips-better-chatgpt-prompts

https://github.com/microsoft/generative-ai-for-beginners/

Videos

Continuous SQL with Kafka and Flink https://www.youtube.com/watch?v=0Fb8ggZlPrQ&ab_channel=stevecantrell

Building Real-time Pipelines: A Case Study by Transit Data https://www.youtube.com/watch?v=VjmC4J7KZgw&t=2s&ab_channel=Aiven

Unlocking Financial Data with Real-Time Pipelines (OSACon 2023) https://www.youtube.com/watch?v=Q7gF7m4yFi4&ab_channel=OSACon

The Never Landing Stream https://www.youtube.com/watch?v=M8Bp0tRGvV0

https://www.youtube.com/watch?v=gSvvBHBWq20

https://www.youtube.com/watch?v=ayAGiPd2zq4&t=1s

February 8, 2024 NYC Meetup

https://www.slideshare.net/slideshows/ny-open-source-data-meetup-feb-8-2024-building-realtime-pipelines-with-flank-a-case-study-with-transit-data/266227433

February 20, 2024 Virtual Meetup

https://www.slideshare.net/slideshows/dba-fundamentals-group-continuous-sql-with-kafka-and-flink/266403113

https://www.youtube.com/watch?v=0Fb8ggZlPrQ&ab_channel=stevecantrell

Feb 22, 2024 NYC Meetup

https://www.slideshare.net/slideshows/2024-feb-ai-meetup-nyc-genaillmsmldata-codeless-generative-ai-pipelines/266444687

Events

Feb 28, 2024: NYC. Cloudera Meetup. Flink https://www.meetup.com/futureofdata-princeton/events/298661947/

Feb 29, 2024: Virtual. Conf42 Python. https://www.conf42.com/Python_2024_Tim_Spann_apache_nifi_2_processors

https://www.conf42.com/Python_2024_Karin_Wolok_nifi__kafka_risingwave_iceberg_llm

Soon, 2024: Princeton. TigerLabs New Location. Meetup. GenAI. https://www.meetup.com/applied-generative-artificial-intelligence-applications/

March 15, 2024: TCF Pro. Princeton, NJ. IEEE Information Technology Professional Conference at the Trenton Computer Festival. https://princetonacm.acm.org/tcfpro/

March 28, 2024: Pinot + NiFi + Flink + Kafka Meetup NYC https://www.meetup.com/real-time-analytics-meetup-ny/events/299290822/

April 2024: XtremeJ 2024. Virtual. https://xtremej.dev/2023/schedule/

April 11, 2024: Conf42 LLM. Virtual. https://www.conf42.com/llms2024

May 8-9, 2024: Data Summit 2024. Boston, MA. https://www.dbta.com/DataSummit/2024/default.aspx

Cloudera Events https://www.cloudera.com/about/events.html

More Events: https://www.linkedin.com/pulse/schedule-2024-tim-spann--y4coe

Code

  • https://github.com/tspannhw/FLaNK-python-watsonx-processor
  • https://github.com/thammuio/doc-genius-ai
  • https://github.com/tspannhw/FLaNK-python-processors

Models

  • https://github.com/ncbi/GeneGPT
  • https://www.arxiv.org/abs/2402.03405
  • https://huggingface.co/foduucom/stockmarket-pattern-detection-yolov8
  • https://github.com/WongKinYiu/yolov9

Tools

  • https://github.com/photopea/photopea
  • https://redash.io/
  • https://lookatme.readthedocs.io/en/latest/getting_started.html
  • https://gist.github.com/johnloy/27dd124ad40e210e91c70dd1c24ac8c8
  • https://prql-lang.org/
  • https://fonts.google.com/selection
  • https://www.kineticedge.io/blog/ktools-kafka-topic-truncate/
  • https://htmx.org/
  • https://deervo.itch.io/diskclick
  • https://leanrada.com/htmz/
  • https://groq.com/
  • https://news.mit.edu/2024/tiny-tamper-proof-id-tag-can-authenticate-almost-anything-0218
  • https://github.com/awslabs/llrt
  • https://observablehq.com/framework/getting-started
  • https://academy.datawrapper.de/article/384-how-to-create-small-multiple-line-charts
  • https://github.com/enjalot/latent-scope
  • https://github.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch
  • https://dmarcchecker.app/
  • https://github.com/gcarmix/HexWalk
  • https://markmap.js.org/repl
  • https://github.com/plantuml/plantuml
  • https://predibase.com/blog/lora-land-fine-tuned-open-source-llms-that-outperform-gpt-4
  • https://github.com/microsoft/UFO
  • https://github.com/datadreamer-dev/datadreamer
  • https://thealliance.ai/news
  • https://engineering.fb.com/2022/03/10/security/code-verify/
  • https://www.sciencedaily.com/releases/2024/02/240216135820.htm
  • https://atuin.sh/
  • https://github.com/simulaiofficial/simulai
  • https://hyperdiv.io/
  • https://github.com/OpenMOSS/AnyGPT
  • https://github.com/Dashibase/lotion
  • https://github.com/microsoft/JARVIS
  • https://github.com/ariya/pico-jarvis
  • https://github.com/ibis-project/ibis
  • https://www.sivalabs.in/langchain4j-ai-services-tutorial/
  • https://github.com/weaviate/weaviate-examples/tree/main
  • https://github.com/weaviate/weaviate-examples/tree/main/clip-multi-modal-text-image-search
  • https://huggingface.co/docs/transformers/model_doc/gptj
  • https://github.com/EleutherAI/gpt-neox/
  • https://github.com/weaviate-tutorials/DEMO-multimodal-search
  • https://github.com/cloudera/CML_llm-hol
  • https://github.com/Mozilla-Ocho/llamafile
  • https://pagescms.org/
  • https://github.com/erfanzar/EasyDeL
  • https://github.com/bots-garden/pi-genai-stack
  • https://spring.io/blog/2024/02/23/spring-ai-0-8-0-released
  • https://github.com/Azure/PyRIT
  • https://github.com/amithkoujalgi/ollama4j
  • https://github.com/dustinblackman/oatmeal
  • https://docs.spring.io/spring-ai/reference/api/clients/ollama-chat.html
  • https://opensource.expediagroup.com/stream-registry/
  • https://github.com/ExpediaGroup/beekeeper
  • https://matklad.github.io/2021/02/06/ARCHITECTURE.md.html
  • https://github.com/Frimkron/mud-pi
  • https://pylint.readthedocs.io/en/latest/pyreverse.html
  • https://electric-sql.com/
  • https://medium.com/hashmapinc/nifi-nar-files-explained-14113f7796fd
  • https://github.com/OpenCodeInterpreter/OpenCodeInterpreter
  • https://github.com/tstack/lnav
  • https://github.com/microsoft/FASTER
  • https://github.com/ok-robot/ok-robot
  • https://github.com/google/gemma.cpp
  • https://github.com/Victormeriqui/Consol3
  • https://github.com/chand1012/sq
  • https://github.com/mukovnin/psfiles

Notable Tools

PostgreSQL + MySQL Cache https://github.com/readysettech/readyset

NVIDIA GPU LLM https://github.com/NVIDIA/TensorRT-LLM

Configuration Management Server https://caddyserver.com/features

Fast Text to Image https://fastsdxl.ai/

Very Interesting Remote tool for OBS https://vdo.ninja/

Commands Du Jour

docker system prune -a
docker image prune -a
docker system df
docker ps
docker logs name

© 2020-2024 Tim Spann
