This is a very simple proof-of-concept which, with community collaboration, could easily form the basis of much more efficient Arc execution. If you can see value in this approach and would like to get involved please raise an issue. If sufficient demand is reached we can set up a more formal discussion forum.
How to run
Clone the repository
This repository has a submodule containing the TPC-H data so that execution is easy to demonstrate. When cloning, add the recursive option:
git clone --recurse-submodules https://github.com/tripl-ai/box.git
To execute a job via the command line you can use the provided ./box.sh file, which will execute job.json and is intended to show the basic functionality.
You will need to have Rust installed (see rustup). After the initial Rust install, add the nightly toolchain:
rustup toolchain install nightly
The Rust nightly version is currently required for the simd support. Some packages, such as cmake, may need to be installed in order to compile; the build output should indicate any missing packages.
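Before building, it can help to confirm that the tools mentioned above are on the PATH. The following is a minimal sketch, assuming a POSIX shell; missing_tools is a hypothetical helper name and is not part of this repository.

```shell
# Hypothetical helper (not part of the repository): print each named tool
# that cannot be found on the PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example: missing_tools cargo rustup cmake
```

If nothing is printed, all of the named tools were found. Once rustup is present, running rustup toolchain list shows whether a nightly toolchain has been installed.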
Please note that if running on WSL or Windows you may need to convert the line endings to Unix format (LF) in order to run the script. When checking out the code they may be automatically changed to Windows line endings (CRLF) depending on your config. If you would like git not to convert line endings, you can set core.autocrlf to false:
git config --global core.autocrlf false
See Customizing Git for more information.
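If the scripts have already been checked out with CRLF line endings, they can also be converted in place instead of being cloned again. The following is a minimal sketch, assuming a POSIX shell; to_lf is a hypothetical helper name and is not part of this repository.

```shell
# Hypothetical helper (not part of the repository): remove carriage returns
# from each named file so that it uses Unix (LF) line endings.
to_lf() {
  for f in "$@"; do
    tmp=$(mktemp)
    # Write the cleaned content back through redirection rather than mv,
    # so the original file keeps its permissions (e.g. the executable bit).
    tr -d '\r' < "$f" > "$tmp" && cat "$tmp" > "$f"
    rm -f "$tmp"
  done
}

# Example: to_lf box.sh notebook.sh
```

This removes every carriage return in the file, which is acceptable for shell scripts but should not be applied to binary files.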
To use the notebook functionality, run the provided ./notebook.sh file. The box.ipynb file is a demonstration and is intended to show the basic notebook functionality. You will need Docker installed (see Docker).
The notebook functionality relies on code copied and modified from the evcxr crate.