A community-improved version of the polycubes project!

Overview

Polycubes

This code is associated with the Computerphile video on generating polycubes. The original repository may be found here. That version is unchanged from my original video, so that those watching for the first time can find and use the original code, and make improvements to it themselves. This repository is for those looking to contribute to a faster and better optimised version, driven by improvements from Computerphile viewers!

Introduction

A polycube is a set of cubes in any configuration in which all cubes are orthogonally connected, meaning every cube shares a face with at least one other. This code calculates all the variations of 3D polycubes for any size (time permitting!).
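
To make the definition concrete, here is a minimal sketch (not the project's actual code) of one common representation: a polycube as a set of integer coordinates, with connectivity checked by flood fill.

    from collections import deque

    # A polycube as a set of integer (x, y, z) coordinates.
    pentacube = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (2, 1, 1)}

    def is_connected(cubes):
        """True if every cube is reachable from any other via shared faces."""
        start = next(iter(cubes))
        seen = {start}
        queue = deque([start])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                               (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                nb = (x + dx, y + dy, z + dz)
                if nb in cubes and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        return len(seen) == len(cubes)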

[Image: 5cubes]

Contents

This repository contains three solutions written in three languages: Python, C++, and Rust. Each subfolder contains a README with instructions on how to run them.

Improving the code

This repo already has some improvements included, and will happily accept more via pull request. Some things you might think about:

  • The main limiting factor at this time seems to be memory usage: at n=14+ you need hundreds of GB just to store the cubes, so keeping them all in main memory gets difficult.
  • Distributing the computation across many systems would allow us to scale horizontally rather than vertically, but it opens questions of how to do so without each system having a full copy of all the cubes, and how to manage the large quantities of data.
  • Calculating the 24 rotations of a cube is slow; the only way to avoid this would be to come up with some rotationally invariant way of comparing cubes. I've not thought of one yet! (The current baseline is sketched below.)
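
For reference, the baseline mentioned in the last point might look like this (a sketch assuming cubes are stored as 3D boolean numpy arrays): generate all 24 rotations and keep the largest serialisation as a canonical, rotation-invariant key.

    import numpy as np

    def rotations24(a):
        """Yield all 24 axis-aligned rotations of a 3D array."""
        def spins(b, axes):
            for i in range(4):
                yield np.rot90(b, i, axes)
        # Six choices for where axis 0 ends up, four spins about it each:
        yield from spins(a, (1, 2))
        yield from spins(np.rot90(a, 2, (0, 2)), (1, 2))
        yield from spins(np.rot90(a, 1, (0, 2)), (0, 1))
        yield from spins(np.rot90(a, 3, (0, 2)), (0, 1))
        yield from spins(np.rot90(a, 1, (0, 1)), (0, 2))
        yield from spins(np.rot90(a, 3, (0, 1)), (0, 2))

    def canonical_key(a):
        """Rotation-invariant key: the largest (shape, bits) serialisation."""
        return max((r.shape, np.packbits(r).tobytes()) for r in rotations24(a))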

Contributing!

This version welcomes contributors!

Comments
  • skip global hashtable (N=15 in 5h13)

    Based on the previous patch set. Uses my understanding of the algorithm in https://github.com/mikepound/opencubes/issues/11 by @presseyt. Some nasty implementation details remain (N=16 max, because it still uses the point-list datatype for quicker implementation, and there is lots of rotation and deduplication all over the place).

    Essentially perfectly parallelisable, as each core takes a sub-polycube of order n and returns the number of polycubes that it is the canonical ancestor of.

    A bit slow (around 3x slower than before) at determining what is and isn't a canonical child of the polycube currently being explored.
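
    The parallel structure is roughly as follows (a Python sketch for illustration; count_canonical_children() is a hypothetical stand-in for the per-cube work described above):

        from multiprocessing import Pool

        def descendants(cube):
            # Hypothetical helper: counts the (n+1)-cube extensions of `cube`
            # whose canonical parent is `cube` itself.
            return count_canonical_children(cube)

        def count_order_n_plus_1(cubes_n, workers=8):
            # No shared hashtable: workers only ever see their own sub-polycube.
            with Pool(workers) as pool:
                return sum(pool.imap_unordered(descendants, cubes_n, chunksize=1024))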

    opened by NailLegProcessorDivide 23
  • C++ implementation

    This implementation is completely separate from the Python implementation. It uses a simpler representation than RLE: just a list of the indices of all ones (like .nonzero() in numpy).
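
    In numpy terms, the representation is simply the following (illustrative sketch):

        import numpy as np

        grid = np.zeros((2, 2, 2), dtype=bool)
        grid[0, 0, 0] = grid[1, 0, 0] = grid[1, 1, 0] = True

        points = np.argwhere(grid)   # the indices of all ones, like .nonzero()
        # array([[0, 0, 0],
        #        [1, 0, 0],
        #        [1, 1, 0]])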

    opened by nsch0e 23
  • Optimize Rotations::rotate() more

    Bunch of small optimizations I made:

    • Hoist nearly all allocations out of the hot loop that checks the rotations.
    • Add an out argument as Rotations::rotate(..., Cube&) and make it return std::pair<XYZ, bool>, where the second member is true if the rotation is valid. This avoids allocating memory on every call to rotate().
    • Update/fix test_rotations.cpp to work with the changed rotate().
    • Add Cube::empty_from() to clear a cube and reserve enough space.
    • clang-format the sources.
    • Misc fixups to the cubes code.

    With this patch:

        N = 12 || generating new cubes from 2522522 base cubes.
        66 sets by shape for N=12
        converting to vector
        starting 2 threads
        done took 98.79 s [ 0, 1261261]
        done took 114.17 s [1261261, 2522522]
        num cubes: 18598427
        ./build/cubes -n 12 -t 2  245.05s user 1.88s system 180% cpu 2:16.76 total

    Versus before:

        N = 12 || generating new cubes from 2522522 base cubes.
        66 sets by shape for N=12
        converting to vector
        starting 2 threads
        done took 119.84 s [ 0, 1261261]
        done took 139.10 s [1261261, 2522522]
        num cubes: 18598427
        ./build/cubes -n 12 -t 2  297.87s user 2.04s system 180% cpu 2:46.07 total

    N=12 is the biggest that I can reasonably test, and most of the time is still spent sorting the Cube rotations...

    My idea was that the fewer memory allocations we do, the better multi-threading will scale, since threads are not stuck waiting in the memory allocator. I have follow-up changes that require this patch in order to reduce the allocations even further.

    opened by JATothrim 22
  • Another rust implementation

    I saw this as a good opportunity to try and do some math-related stuff with Rust :D

    I had missed #6 before I got started on it and got kind of carried away.

    It does pretty much the same thing as the python implementation, but a bit faster, and for now without RLEs.

    Also, since we're solving for a known number of dimensions (3), I just went with a fixed-size array implementation. This also semi-forced me to implement transpositions and such, which was a good challenge (note: axis[0] != a1, raah)

    So far it seems to do what it should, yielding the correct numbers up to n=13 (the n=13 calculation took 2 hours and 10 minutes on a Xeon E5-2630v3; I haven't kept track of memory usage).

    Possible improvements:

    • [x] Parallelize computation (using rayon. Note: impl might be faster by collect-ing on sublists, but that eats up a lot of memory)
    • [x] Add serde + load and save from cache
    • [ ] Add RLE (for completeness)
    • [ ] Somehow add something like Cow to the mix and avoid cloning for flip and transpose operations that don't need it
    • [x] ~~Maybe: add interop with numpy format (crude impl suffices)~~ Won't do, pcube format is easier
    opened by datdenkikniet 15
  • more performant rust implementations.

    • [x] Figure out what to do with the readme, currently just concat
    • [x] Implement polycube converters between reps and pass iterators for cache files to enumeration functions
    • [x] Support exporting to cache files from new implementations
    • [x] Give feedback to show progress running computation

    Done:

    • Add a -m mode flag to select the enumeration algorithm
    • Support a -p flag to disable parallelism in point-list mode

    Create a polycube_reps file which stores the polycube representations used by my code. I don't like that it's a separate file from the existing PolyCube rep, but I can't think of how to group it more cleanly.

    opened by NailLegProcessorDivide 12
  • Code Overhaul.

    Implements improvements from the other repo, from myself and ana-096. Refactors the code into multiple files, adds a .gitignore, adds initial unit tests, and makes rendering a command-line flag. Adds parallelism by computing a single unique hash for each shape: the largest value (in terms of the unsigned integer value of the bit array) amongst the hashes of all of its rotations.
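
    That canonical hash might look like this sketch (illustrative only; rotations24() is the generator sketched under "Improving the code" above):

        import numpy as np

        def shape_id(a):
            """One unsigned integer per shape: the largest bit-array value
            across all 24 rotations (rotations24() as sketched earlier)."""
            return max(int.from_bytes(np.packbits(r).tobytes(), 'big')
                       for r in rotations24(a))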

    looking for feedback on all of this, whether we want to merge it as is, cherry-pick certain parts, or ignore it as rubbish. I think at least the refactor and unit tests are desperately needed if we are to work on this in parallel.

    please note this is an opportunity for the whole community to review and consider; feel free to take a look and leave feedback if you have time, even if you won't necessarily be working on it.

    finally, this is leading towards a distributed system for managing this computation: since each job slice doesn't need a full hash set of all existing cubes to do its processing, the current calls to multiprocessing could all be replaced with network calls to a matching client for massive scale.

    opened by bertie2 8
  • Rust disk-only computation test

    This is on top of #21.

    Seeing whether getting the memory size down by never storing many full cubes in memory, and instead streaming them from and to disk, is effective or not. I am comparing to points-list with size16, since that seems to have a pretty small footprint. It is important to see whether the tradeoff between memory usage and slowdown (I'm fairly sure this will be a tad slower) is worth it.

    Results so far (definitely far less memory used :D):

    | Implementation       | N = 12 | N = 13 |
    |:---------------------|:-------|:-------|
    | Disk-only            | 500 MB | 3.6 GB |
    | points-list (size16) | 1.2 GB | 9.3 GB |

    TODO:

    • [ ] Figure out what to do on collisions (probably easily solved by associating a file offset with a specific hash, and doing a lookup that way). Note for the future: just using the offset in the file as the hash value may be a valid strategy? Lookups may be slow, but at least very easy. Worth trying
    • [ ] Don't immediately start overwriting the file for N = X when starting for that N
    opened by datdenkikniet 6
  • pcube file format discussion

    For (eventual) implementations that may support reading & writing cubes directly from disk, to avoid having to store them in RAM, it would be really beneficial if we could also write the count of cubes at the end of the file.

    If writing the count at the end of the file is supported, cubes can be written in a streaming fashion and the count appended when done. If it's not supported, you have to rewrite the entire file from the beginning in order to fit the LEB128-encoded count into the header.

    Is there an easy way to support this?

    I think we should definitely add a byte to the header that is just flags, with 1 bit explicitly reserved for increasing said header size (if we ever find more than 7 flags we need).
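
    For illustration, LEB128 itself is simple, and a trailing count allows fully streamed writes (a sketch; the file layout and the generate_cubes()/serialize() helpers are hypothetical, not the actual pcube format):

        def leb128(n: int) -> bytes:
            """Unsigned LEB128: 7 payload bits per byte, high bit = 'more follows'."""
            out = bytearray()
            while True:
                b = n & 0x7F
                n >>= 7
                out.append(b | 0x80 if n else b)
                if not n:
                    return bytes(out)

        # Streaming write: emit cubes as they are found, append the count last.
        with open("cubes.pcube.part", "wb") as f:    # hypothetical layout
            count = 0
            for cube in generate_cubes():            # hypothetical generator
                f.write(serialize(cube))             # hypothetical serializer
                count += 1
            f.write(leb128(count))                   # the proposed trailing count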

    opened by datdenkikniet 4
  • C++ CMakeLists.txt refactor

    • Make all target(s) configuration (defines, compile flags) consistent
    • Use generator expressions for setting Debug/Release options
    • Make Debug/Release/RelWithDebInfo work for all targets
    • Optionally build tests: enabled with BUILD_TESTS option

    Please check that the compilation options and flags match what they should be on upstream.

    opened by JATothrim 4
  • Rust version: slight improvements to naive implementation

    • Rename some things so that it's more obvious what they are
    • Make RawPCube a type so we can easily convert between items coming from/going to a .pcube file and "others"
    • Split out file handling a bit
    opened by datdenkikniet 3
  • Kevin Gong's methodology?

    Has anyone found any details of how Kevin Gong approached this problem? Apparently he was able to enumerate n=16 in under 11 days back in 2004, on a much slower/smaller computer than we have today; one would have to imagine he made some sort of breakthrough that no one here has happened across yet.

    opened by dandongus 1
  • Rework hash checking.

    Instead of checking all rotations against the known IDs for each new cube, first add all rotations of each new cube to a side list of known cubes. Then checking new cubes that happen to be rotated duplicates is faster. The overall improvement is 25-30% for n={7,8,9,10}.
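
    In outline (a Python sketch of the idea; pack()/unpack() are hypothetical helpers converting between a hashable key and an array, and rotations24() is as sketched earlier):

        def filter_unique(candidates):
            seen = set()
            unique = []
            for key in candidates:           # key: hashable packed form
                if key in seen:              # one set lookup, no rotating here
                    continue
                unique.append(key)
                for r in rotations24(unpack(key)):
                    seen.add(pack(r))        # pre-register every rotation
            return unique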

    opened by emsy 1
  • Add some traits to the rust version

    Add some traits that let us more properly define what data we take in and what data we produce.

    Also moves around the enumerate option of the CLI a bit, and puts the CLI code in a subdirectory.

    I intend to move hashless, point-list and rotation_reduced over to these traits as well, but fear that the PR will get too big.

    I'm pretty bad at making well-sized commits, so squashing this PR is OK.

    opened by datdenkikniet 2
  • Javascript implementation (new cube encoding)

    This is currently an order of magnitude faster than Python and an order of magnitude slower than C++. I wanted to make a PR to highlight some (possibly) new ideas.

    New Canonical Encoding

    Given a root cube and an orientation, we can encode a cube by its adjacency graph. I'll illustrate my method by encoding this polycube with 5 cubes.

             +--+
            /  /|
        +--+--+--+
       /  /  /  / |
      +--+--+--+  |
      |  |  |  | /|
      +--+--+--+  |
            |  | /
            +--+
    

    We choose a cube (cube #0) and the orientation [Left Right Up Down Forward Back]. Record the adjacent cubes to cube #0 as six bits:

             +--+
            /  /|
        +--+--+--+
       /  /  /  / |
      +--+--+--+  |
      |#0|  |  | /|
      +--+--+--+  |
            |  | /
            +--+
    
      Cube 0
      LRUDFB
      010000
    

    We add any adjacent cubes to our cube order. We have marked cube #1. We can now repeat the process for cube #1:

             +--+
            /  /|
        +--+--+--+
       /  /  /  / |
      +--+--+--+  |
      |#0|#1|  | /|
      +--+--+--+  |
            |  | /
            +--+
    
      Cube 0  Cube 1
      LRUDFB  LRUDFB
      010000  110001
    

    There are two new adjacent cubes. They are added to our list of cubes in the same order as our orientation (in this case, first Left, then Right, Up, Down, Forward and Back. This would be different if we picked a different orientation). We now repeat the process for cube #2:

             +--+
            /#3/|
        +--+--+--+
       /  /  /  / |
      +--+--+--+  |
      |#0|#1|#2| /|
      +--+--+--+  |
            |  | /
            +--+
    
      Cube 0  Cube 1  Cube 2
      LRUDFB  LRUDFB  LRUDFB
      010000  110001  100100
    

    At this point we have completely ordered our cubes. Any additional bits would be redundant. We finish our encoding with six 0 bits.

             +--+
            /#3/|
        +--+--+--+         Representation choosing cube #0 and LRUDFB:
       /  /  /  / |
      +--+--+--+  |        010000  110001  100100  000000
      |#0|#1|#2| /|
      +--+--+--+  |
            |#4| /
            +--+
    

    Define the canonical representation of a cube to be the maximum representation over all choices of first cube and orientation.

             +--+
            /#3/|
        +--+--+--+         Canonical representation:
       /  /  /  / |
      +--+--+--+  |        111000  010010  000000
      |#2|#0|#1| /|
      +--+--+--+  |
            |#4| /
            +--+
    

    This was encoded by choosing the middle cube to be cube #0: this is the only cube that can be encoded with three ones at the beginning of its adjacency graph, so we do not need to consider any other cubes as the root. The representation was maximal when we chose the orientation RLBFDU.

    When we parse this representation, we do not know the orientation it was encoded with, so we will get the same cube in (possibly) a different orientation.

    Representation: 111000 010010 000000
    Parsing orientation:    LRUDFB
    
      Initial    After parsing    After parsing
      state:     6 bits:          12 bits:
                      ___             ___
                     /  /|           /  /|
          ___      _+--+ |__       _+--+ |__
         /  /|    / |#3|/  /|     / |#3|/  /|
        +--+ |   +--+--+--+ |    +--+--+--+ |
        |#0|/    |#1|#0|#2|/    /  /|#0|#2|/
        +--+     +--+--+--+    |⎺⎺| +--+--+
                               |#4|/
                                ⎺⎺
    

    Note that this canonical representation gives a unique ordering of the cubes, and the cubes are in increasing distance from the root cube.

    So we can take away the last cube, leaving a connected polycube. This gives a well-defined manner of assigning a polycube with n-1 cubes to a given polycube with n cubes.
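
    A minimal Python sketch of this encoding (coordinates as integer tuples; rotation_matrices() enumerates the 24 orientations; unoptimised, and ignoring the leading-ones pruning of root candidates described above):

        import itertools
        import numpy as np

        # The six face directions, in the order [Left Right Up Down Forward Back].
        DIRS = np.array([[-1, 0, 0], [1, 0, 0], [0, 1, 0],
                         [0, -1, 0], [0, 0, 1], [0, 0, -1]])

        def rotation_matrices():
            """The 24 proper rotations as integer matrices (determinant +1)."""
            mats = []
            for perm in itertools.permutations(range(3)):
                for signs in itertools.product((1, -1), repeat=3):
                    m = np.zeros((3, 3), dtype=int)
                    for row, (col, s) in enumerate(zip(perm, signs)):
                        m[row, col] = s
                    if round(np.linalg.det(m)) == 1:
                        mats.append(m)
            return mats

        def encode(cubes, root, rot):
            """Adjacency words, plus the cube ordering they induce, for one
            choice of root cube and orientation."""
            dirs = [tuple(int(x) for x in rot @ d) for d in DIRS]
            order = [root]
            index = {root: 0}
            words = []
            for cube in order:                # breadth-first, in discovery order
                word = 0
                for d in dirs:
                    nb = (cube[0] + d[0], cube[1] + d[1], cube[2] + d[2])
                    word <<= 1
                    if nb in cubes:
                        word |= 1
                        if nb not in index:   # new neighbours numbered in direction order
                            index[nb] = len(order)
                            order.append(nb)
                words.append(word)
                if len(order) == len(cubes):  # all cubes ordered: six 0 bits, stop
                    return tuple(words) + (0,), order
            return tuple(words), order        # only reached if `cubes` is disconnected

        def canonical(cubes):
            """Maximum representation over all choices of root and orientation."""
            return max(encode(cubes, root, rot)
                       for root in cubes
                       for rot in rotation_matrices())

    Returning the cube ordering alongside the words is what makes "remove the last cube" well defined below.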

    This allows us to not use a hash table and parallelize computation as in issue #11.

    1. Start with a cube p, and extend p by one cube to get p+1
    2. Find the canonical representation of p+1, and remove the last cube to create p+1-1
    3. If p+1-1 is the same polycube as p (in some orientation), save p+1 (see the sketch below)
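
    Putting the three steps together (again illustrative; DIRS, encode() and canonical() are from the sketch above):

        def extensions(p):
            """All p+1: add one face-adjacent cube (deduplicated within this p)."""
            out = set()
            for d in DIRS:
                for cube in p:
                    nb = (cube[0] + int(d[0]), cube[1] + int(d[1]), cube[2] + int(d[2]))
                    if nb not in p:
                        out.add(frozenset(p | {nb}))
            return out

        def canonical_children(p):
            """Keep q = p+1 only if removing the last cube of q's canonical
            ordering gives back p (up to rotation), so no global hash table."""
            words_p, _ = canonical(p)
            kept = []
            for q in extensions(p):
                _, order = canonical(q)
                parent = set(q) - {order[-1]}     # q minus its last-ordered cube
                if canonical(parent)[0] == words_p:
                    kept.append(q)
            return kept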

    I know other people have already implemented ways of getting rid of a hash table, or of splitting computation by the dimensions of the polycube (see issue #27 and PRs #26 and #28). These people are much better coders than me, and I haven't had the time to fully read through everything that has been done, but hopefully this is an interesting way of looking at the problem. It's been a lot of fun coding a proof of concept, even if I know a javascript implementation isn't going to go anywhere ;)

    opened by presseyt 0
  • Shape encoding

    I had an idea for a rotation- and reflection-invariant method of encoding. I don't have a lot of time to try to implement it right now, but I thought I would put it here. It may be expensive in itself, but if it reduces the algorithmic complexity it would be worth it.

    A single binary 3D matrix can have every possible transformation applied to it, giving a set of every orientation of the shape as matrices. That set is unique to that shape. If each entry is then flattened, sorted, and concatenated, you get a big binary lump that's unique. This can then have a standard hashing function applied to it. It could probably be compressed considerably as well.
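
    A sketch of the idea (reusing rotations24() from earlier; reflections could be covered by also processing a mirrored copy such as a[::-1]):

        import hashlib
        import numpy as np

        def invariant_key(a):
            """Serialise every orientation, sort, concatenate, then hash."""
            forms = sorted(bytes(r.shape) + np.packbits(r).tobytes()
                           for r in rotations24(a))
            return hashlib.sha256(b"".join(forms)).hexdigest()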

    opened by bossstein 0
  • Solution where memory usage isn't an issue

    @hidny has come up with a solution where memory usage isn't the limitation. It also appears faster. You can find my implementation in https://gitlab.com/dzamlo/polycubes2

    Unfortunately, I cannot make a pull request because my laptop died.

    opened by dzamlo 5
  • C++ | WIP Splitting work by output shape

    This branch splits the problem by output shape. Features:

    • Threading rework
    • CacheReader with a memory-mapped file (less memory usage)
    • Output files by output shape (less memory usage, because only one set per shape is filled at a time)

    Needs:

    • review, and ideas for better iterators and a CubeView which overlays the cache file without copying.
    • ~depends on cache files! No independent runs from N=1 to N=X.~
    • no merge functionality for split cache files yet.

    Benchmarks (from N-1):

    Whole output in RAM:

    • N=13 T=20: 141 s (12700H, 16 GB) [peak mem: ~14 GB?! don't remember correctly, but was swapping :)]

    Only one shape in RAM:

    • N=13 T=20: 165 s (12700H, 16 GB) [peak mem: ~3 GB] (slower because clearing hashy is expensive :))
    • N=14 T=12: 2500 s (AMD 3600, 32 GB) [peak mem: ~24 GB]
    • N=14 T=20: 1936 s (12700H, 64 GB) [peak mem: ~24 GB]

    opened by nsch0e 12
Owner: Michael Pound, Assistant Professor in Computer Science