From 39fe666fc7d15307c35183dd0ae40570cf743a81 Mon Sep 17 00:00:00 2001
From: Jan Margeta
Date: Mon, 22 Oct 2018 22:22:25 +0200
Subject: [PATCH] Add Ray to Cluster computing

# What is this Python project?

Ray is a flexible, high-performance distributed execution framework. It achieves parallelism in Python with a simple and consistent API. Ray is particularly well suited to machine learning and forms the base of libraries for deep and reinforcement learning, distributed processing of Pandas dataframes, and hyperparameter search. (A brief usage sketch follows the patch below.)

# What's the difference between this Python project and similar ones?

- Similar to Dask; see a comparison here: https://github.com/ray-project/ray/issues/642
- Shares large numpy arrays (or any objects serializable with Arrow) efficiently between processes, without copying the data and with only minimal deserialization
- Achieves lower latency with bottom-up scheduling

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index a025514..6366033 100644
--- a/README.md
+++ b/README.md
@@ -248,6 +248,7 @@ Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).
 * [faust](https://github.com/robinhood/faust) - A stream processing library, porting the ideas from [Kafka Streams](https://kafka.apache.org/documentation/streams/) to Python.
 * [luigi](https://github.com/spotify/luigi) - A module that helps you build complex pipelines of batch jobs.
 * [mrjob](https://github.com/Yelp/mrjob) - Run MapReduce jobs on Hadoop or Amazon Web Services.
+* [Ray](https://github.com/ray-project/ray/) - A system for parallel and distributed Python that unifies the machine learning ecosystem.
 * [streamparse](https://github.com/Parsely/streamparse) - Run Python code against real-time streams of data via [Apache Storm](http://storm.apache.org/).

 ## Code Analysis
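As a rough illustration of the API described above, here is a minimal sketch of Ray's remote functions and shared object store. It assumes `ray` is installed on the local machine; the `column_means` function, array shape, and task count are illustrative examples, not part of the patch.

```python
import numpy as np
import ray

ray.init()  # start Ray on the local machine

@ray.remote
def column_means(matrix):
    # Runs in a separate worker process.
    return matrix.mean(axis=0)

# Put a large array into the shared object store once; workers read it
# without copying the data, thanks to Arrow-based serialization.
data_ref = ray.put(np.random.rand(10_000, 100))

# Launch several tasks in parallel; each call returns a future immediately.
futures = [column_means.remote(data_ref) for _ in range(4)]

# Block until all results are ready.
results = ray.get(futures)
print(results[0].shape)  # (100,)
```

The same decorator-based API scales from a laptop to a cluster: `ray.init()` can instead attach to a running cluster, and the task code stays unchanged.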