Coder Social home page Coder Social logo

djeada / algorithms-and-data-structures Goto Github PK

View Code? Open in Web Editor NEW
445.0 3.0 70.0 4.47 MB

A collection of 100+ projects in C++ and Python that implement various data structures and algorithms. The projects are organized by language and topic, and include detailed explanations and examples to help you understand how they work.

Home Page: https://adamdjellouli.com/articles/algorithms_and_data_structures

License: MIT License

Python 36.67% CMake 19.58% C++ 42.45% C 0.17% Shell 0.89% Mermaid 0.24%
algorithms data-structures dynamic-programming collections backtracking sorting-algorithms

algorithms-and-data-structures's Introduction

Algorithms-And-Data-Structures

GitHub stars GitHub forks GitHub license

This repository contains a collection of projects in C++ and Python that implement various data structures and algorithms. The projects are organized by language and topic, and include detailed explanations and examples to help you understand how they work.

algorithms

About

Ever since I first tackled Algorithms and Data Structures at university in 2015, I've found it super useful to regularly go back to the basics. This becomes even more important when you're trying to learn a new programming language - a strong foundation is key. To help others, I've decided to share my code and notes on the subject with everyone.

My code is written in two programming languages I really enjoy, C++ and Python. I've done my best to stick to the latest best practices. Alongside the code, you'll find the notes I made while learning. These notes give more context and could be really handy for anyone new to Algorithms and Data Structures.

Requirements

The following requirements are necessary to build and run the code in this repository:

  • For C++ projects:
    • A C++ compiler supporting C++14
    • CMake 3.15 or later
  • For Python projects:
    • Python 3.10 or later

No additional libraries or modules are required.

Running the Examples

This repository is organized into distinct algorithm implementations, each contained in its own subdirectory. These subdirectories provide the source code, unit tests, and build configuration files necessary for each algorithm. Because each algorithm forms a separate project, you should handle the build and test processes individually.

Building and Testing C++ Projects

Building and testing C++ projects involve a sequence of steps. Here's a detailed walkthrough:

  1. Navigate to the project directory: Start by moving into the directory containing the specific project you want to build and test.

  2. Create and navigate into the build directory:

mkdir -p build && cd build

This command creates a new directory named build (if it doesn't already exist) and then navigates into it. The build directory is where the output files of the build process will be stored.

  1. Generate the build files with CMake:
cmake ..

This command runs CMake to generate the build files. .. tells CMake to look for the CMakeLists.txt file in the directory above build.

  1. Build the project:
make

This command compiles the source code using the instructions specified in the CMakeLists.txt file.

  1. Run the unit tests:
ctest --verbose

The ctest --verbose command executes the unit tests and uses the verbose flag to provide a detailed output.

Testing Python Projects

To test a Python project, execute the following command in the project directory:

python -m unittest discover -v

This command uses Python's built-in unittest module to discover and run the tests. The -v (verbose) flag is used to get more detailed output from the tests.

Using the Testing Utility Script

For convenience, this repository includes a utility script named run_tests.sh. Execute the following commands from the repository's root to run tests in all subprojects:

  • To run all unit tests: ./run_tests.sh
  • To run all Python tests: ./run_tests.sh --python
  • To run all C++ tests: ./run_tests.sh --cpp
  • To read all options from terminal: ./run_tests.sh --help

Code Formatting Conventions

Consistent code formatting is essential for maintaining readability and understanding of the codebase. Therefore, we have adopted specific formatting guidelines for each programming language used in this repository.

C++ Formatting

We adhere to the Google C++ Style Guide. To automatically format the code, we use clang-format. Use the following command to format your code:

find . -regex '.*\\.(cpp|hpp|cu|c|h)' -exec clang-format -style=file -i {} \\;

This command recursively finds all files with .cpp, .hpp, .cu, .c, or .h extensions and formats them using clang-format.

CMake Formatting

CMake files should have a consistent style as well. For this, we use cmake-format. To format a CMakeLists.txt file, execute the following command:

cmake-format CMakeLists.txt -i

This command applies the cmake-format to the CMakeLists.txt file.

Python Formatting

We follow the PEP 8 - Style Guide for Python Code for Python projects and use black to automatically format the code. Use the following command to format your Python code:

black .

This command formats all Python files in the current directory and its subdirectories using black.

Notes

List of projects

Data structures

# Structure Implementation
1 Linked List Python Cpp
2 Vector Python Cpp
3 Stack Python Cpp
4 Queue Python Cpp
5 Heap Python Cpp
6 Binary Search Tree Python Cpp
7 Avl Tree Python Cpp
8 Red Black Tree Python Cpp
9 Hash Table Python Cpp

Graphs

# Algorithm Implementation
1 General graph Python Cpp
1 Is Bipartite? Python Cpp
2 BFS Python Cpp
3 DFS Python Cpp
4 Dijkstra Python Cpp
5 A* Python Cpp
6 Bellman-Ford Python Cpp
7 Kruskal Python Cpp
8 Prim Python Cpp

Backtracking

# Problem Solution
1 Permutations Python Cpp
2 Combinations Python Cpp
3 String Pattern Python Cpp
4 Generating words Python Cpp
5 K-colorable configurations Python Cpp
6 Hamiltonian paths Python Cpp
7 Knigt tour Python Cpp
8 Topological orderings Python Cpp
9 Tic-tac-toe (minimax) Python Cpp

Dynamic Programming

# Problem Solution
1 Fibonacci Python Cpp
2 Grid travelers Python Cpp
3 Stairs Python Cpp
4 Can sum Python Cpp
5 How sum Python Cpp
6 Best sum Python Cpp
7 Can construct Python Cpp
8 Count construct Python Cpp
9 All construct Python Cpp
10 Coins Python Cpp
11 Longest increasing subsequence Python Cpp
12 Longest common subsequence Python Cpp
13 Knuth-Morris-Pratt Python Cpp
14 Minimum insertions to form a palindrome Python Cpp

Sorting

# Algorithm Implementation
1 Insertion sort Python Cpp
2 Selection sort Python Cpp
3 Heap sort Python Cpp
4 Merge sort Python Cpp
5 Quick sort Python Cpp

Brain Teasers

# Problem Solution
1 Minimum deletions to make valid parentheses Python Cpp
2 Is palindrome after at most one char deletion? Python Cpp
3 K closest points to origin Python Cpp
4 Subarray sum equals K Python Cpp
5 Add numbers given as strings Python Cpp
6 Dot product of two sparse vectors Python Cpp
7 Range sum of BST Python Cpp
8 Product of array except self Python Cpp
9 Convert BST to sorted doubly linked list Python Cpp
10 Lowest common ancestor of a binary tree Python Cpp
11 LRU Cache Python Cpp
12 Randomize An Array Python Cpp
13 Binary Tree Right Side View Python Cpp
14 Design Browser History Python Cpp
15 Score After Flipping Matrix Python Cpp

How to Contribute

We encourage contributions that enhance the repository's value. To contribute:

  1. Fork the repository.
  2. Create your feature branch (git checkout -b feature/AmazingFeature).
  3. Commit your changes (git commit -m 'Add some AmazingFeature').
  4. Push to the branch (git push origin feature/AmazingFeature).
  5. Open a Pull Request.

References

Books

  • Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford
    Introduction to Algorithms, 3rd Edition (The MIT Press)
    Amazon Link

  • Halim, Steven
    Competitive Programming 3
    Amazon Link

  • Karumanchi, Narasimha
    Data Structures and Algorithms Made Easy: Data Structures and Algorithmic Puzzles
    Amazon Link

  • Kernighan, Brian; Ritchie, Dennis
    The C Programming Language
    Amazon Link

  • Skiena, Steven; Revilla, Miguel
    Programming Challenges: The Programming Contest Training Manual
    Amazon Link

  • Laaksonen, Antti
    Guide to Competitive Programming: Learning and Improving Algorithms Through Contests (Undergraduate Topics in Computer Science)
    Amazon Link

  • Nimajneb, Nite
    The Hitchhiker’s Guide to the Programming Contests

  • Prasad, L. V. Narashima; Patro, E. Kishna Rao
    Lecture Notes on Data Structures using C

Online Courses and Resources

License

This project is licensed under the MIT License - see the LICENSE file for details.

Star History

Star History Chart

algorithms-and-data-structures's People

Contributors

djeada avatar copilot avatar colevanderswands avatar welding-torch avatar

Stargazers

Vishal Katariya avatar Rahul Maurya avatar Thompson (ttomsin) avatar  avatar Uvathe avatar Ming-Chi Liu avatar Duy Thnh avatar Ashwin Aravind avatar  avatar Matt Yates avatar  avatar  avatar  avatar 0x42 avatar Arty Belanger avatar Ankit Shrivastava avatar Jeff Carpenter avatar Conner Houdek avatar Saurav Sharan avatar  avatar JYOTSHANA S R avatar Omar Abdalgwad avatar  ENGYAHYE avatar Harneet Tatla avatar  avatar c avatar Dipayan SInha avatar lyyuser avatar Aditi avatar  avatar Chaitanya avatar Aaron Yap avatar  avatar 23f2001123 avatar Hamdaouy Mohamed avatar  avatar Hannan avatar Apurv D. Kulkarni avatar brianestimejr avatar  avatar Surya avatar Adnan Ahmed avatar Atharv Singh avatar Trần Đức Quang avatar Prakhil Lohiya avatar alan avatar  avatar Gabriel Fernandes avatar  avatar  avatar Ateng avatar  avatar Dam Phu Quy avatar  avatar Vladimir avatar Benheddi Youssouf avatar Chistr avatar Ibrahim Zaman avatar  avatar Kyle Otterstedt avatar Misha avatar Adrian Staniec avatar Fenera Taye avatar TRISHA SHARMA avatar Sam avatar Anushka Sharma avatar Sean Liang avatar Samita Khanduri avatar Kunal Luthra avatar Bjarte Botnevik avatar Muhammad Ahmed avatar Nathan Santos avatar  avatar Lucas Fernandes avatar vineeth shettigar avatar Raingsey SAMOL avatar Rafael Henrique avatar  avatar  avatar aendriu avatar Mudassir Ashraf avatar Binayak Mishra avatar Abdullah Gull avatar pradeep avatar Viraj Potdar avatar Chandal Raju Pawankumar Raju avatar  avatar  avatar  avatar Varshini avatar Guus Bouwens avatar Shailesh Jadav avatar bisckoot avatar Anmol Desai avatar  avatar Hemanth sai Madadapu avatar Naina Bhalla avatar  avatar Harshi Ch avatar Aman Singh avatar

Watchers

 avatar Krzysztof Kowalczyk avatar Shahhill Islam avatar

algorithms-and-data-structures's Issues

fix notes greedy algorithms

What’s holding it back (clarity + flow)

1) The page reads like separate mini-notes, not one narrative

Each section repeats a similar pattern (problem → baseline → greedy → pseudo → complexity), which is good… but the transitions don’t build a single “greedy toolbox.” A reader finishes thinking “I learned 7 algorithms,” more than “I can design/prove greedy algorithms now.”

Cleaner approach: make the greedy design/proof checklist the spine of the article, and then show each example as an application of that checklist.

A practical checklist you can reuse in every section:

  1. Greedy choice: what local choice do we make?
  2. Feasibility: what constraint must always remain true? (the invariant)
  3. Why safe: exchange argument / cut property / “first time we differ” swap
  4. Implementation: what data structure makes the choice fast?
  5. Complexity: what dominates runtime?

Right now, you have the ingredients (exchange/invariant, cut/cycle rules, etc.), but you don’t reuse the same proof skeleton explicitly across sections. ([adamdjellouli.com][1])

2) Matroids show up too early (and a bit “drive-by”)

The matroid paragraph is accurate vibe-wise, but it’s a conceptual jump before the reader has fully internalized the “greedy-choice + exchange” mechanic. ([adamdjellouli.com][1])

Better flow: either

  • move matroids to an “Optional: when greedy is automatically optimal” section near the end, or
  • keep it early but add a 2–3 line bridge: “MST and interval scheduling are both matroid-ish (graphic matroid / interval scheduling as a special structure), that’s why greedy works so often here.”

3) Some sections are “greedy-adjacent” but you don’t label them that way

Maximum contiguous sum is commonly taught as DP (Kadane) or prefix-min trick; it can be explained greedily (“drop a negative-running prefix”), but readers may wonder why it’s in a greedy article unless you explicitly connect it to the greedy-choice property. ([adamdjellouli.com][1])

A simple fix: add a short line like: “This is greedy in the sense that any negative prefix is never worth keeping, so we discard it immediately.”

4) Formatting/consistency issues interrupt comprehension

Across the article you switch between:

  • 0-based indexing (jump game: squares 0..n-1), ([adamdjellouli.com][1])
  • 1-based indexing (max subarray uses positions 3..5 and formulas with (S_0), (S_j=\sum_{k=1}^j x_k)), ([adamdjellouli.com][1])
  • pseudocode loops that assume 1-based arrays (for j in 1..n: S = S + x[j]). ([adamdjellouli.com][1])

That’s survivable, but it adds friction. Pick one convention (0-based is typical in code) and stick to it everywhere.


Concrete errors / things to fix

Interval scheduling: example contradiction (max count is not 4 with given input)

You list intervals:

(1,3) (2,5) (4,7) (6,9) (8,10) (9,11) ([adamdjellouli.com][1])

Then you claim:

“A best answer keeps four intervals, for instance (1,3),(4,7),(8,10),(10,11).” ([adamdjellouli.com][1])

But (10,11) is not in the input set. With the given list, the greedy run you show keeps 3 intervals, not 4—and (in fact) 3 is the true maximum for that set.

Fix options:

  • Add (10,11) to the input list, or
  • Change the claim to “keeps three intervals,” and remove the 4-interval example.

Max contiguous sum: typo in the incremental form

You wrote:

(E \leftarrow \max(x_j,; E+x_j)) ([adamdjellouli.com][1])

That stray ; breaks the expression. Should be:

  • (E \leftarrow \max(x_j,\ E+x_j))

Huffman baseline math has a small notation bug

You wrote:

“minimizing (\sum f_i,L_i)” ([adamdjellouli.com][1])

That comma is almost certainly unintended; should be:

  • (\sum_i f_i L_i)

Huffman section: dangling/unfinished inequality rendering

Near the end:

“It also matches the information-theoretic bound (H(f)\le \mathbb{E}[L]” ([adamdjellouli.com][1])

This looks cut off (missing closing delimiter and likely the rest of the statement). Typically you’d say something like:

  • (H(f) \le \mathbb{E}[L] < H(f)+1) (for Huffman codes)

At minimum: close the math and finish the sentence.

Huffman: over-strong statement about ties/lengths

You say:

“There can be multiple optimal codebooks when there are ties in frequencies; their lengths agree, though the exact bitstrings may differ.” ([adamdjellouli.com][1])

This is often true in practice for unequal frequencies, but with ties, assignment of lengths to specific symbols can differ among optimal trees (especially for equal-frequency symbols). Safer wording:

  • “Expected length is the same; codewords/leaf assignments may differ (especially among equal-frequency symbols).”

A cleaner, more fluid structure (suggested rewrite outline)

Here’s a structure that keeps your content but reads smoother and teaches “transferable greedy thinking”:

  1. What greedy is (1 screen max)

    • Local choice + feasibility
    • One punchy warning: “Greedy isn’t magic; you must prove the choice is safe.” ([adamdjellouli.com][1])
  2. The Greedy Proof Toolkit (the backbone)

    • Exchange argument template (with a 4-line generic template) ([adamdjellouli.com][1])
    • Loop invariant template (state → init → maintain → terminate) ([adamdjellouli.com][1])
    • (Optional box) Cut/cycle rules as specialized exchange arguments (foreshadow MST/Dijkstra). ([adamdjellouli.com][1])
  3. Examples grouped by “greedy pattern”

    • Frontier / reachability greedy: Jump game (invariant: “F is furthest reachable so far”). ([adamdjellouli.com][1])
    • Cut-based greedy on graphs: MST + Dijkstra (explicitly say: “each step picks a cheapest safe edge / settled node”). ([adamdjellouli.com][1])
    • Scheduling greedy: interval scheduling + EDD lateness (both classic exchange arguments). ([adamdjellouli.com][1])
    • Merge-the-two-smallest greedy: Huffman (the “cost increases by p+q” argument is great—keep it). ([adamdjellouli.com][1])
    • Discard-negative-prefix greedy: max subarray (explicitly connect it to the greedy decision). ([adamdjellouli.com][1])
  4. When greedy is guaranteed

    • Matroids (optional) + a 2-line explanation of why it’s the “formal version of exchange arguments.” ([adamdjellouli.com][1])
  5. Common failure mode (one short counterexample)

    • e.g., coin change with a non-canonical coin system, or scheduling by earliest start (fails).
      This prevents the page from feeling like “greedy always works.”

Quick style/clarity improvements that pay off immediately

  • Put the Table of Contents near the top, not after Huffman (right now it appears at the end in the scraped view). ([adamdjellouli.com][1])
  • Standardize each section to the same mini-template, with consistent headings:
    Problem → Greedy rule → Algorithm → Proof sketch → Complexity → Edge cases.
  • Choose one indexing convention (0-based is easiest if your pseudocode looks like Python/C++).
  • For graph algorithms, explicitly state whether the example is directed vs undirected and whether you’re assuming connected (MST already says connected—good). ([adamdjellouli.com][1])
  • If you keep pseudocode language-agnostic, add 1–2 clarifying lines like “heap may contain stale entries; we skip them” (you did this nicely in Dijkstra). ([adamdjellouli.com][1])

fix notes searching algorithms

What’s holding it back (flow + scope)

1) The scope is bigger than the title suggests, but not framed as such

Your intro says searching in “array, list, tree, or graph,” but the article doesn’t actually go into tree search (BST), graph search (BFS/DFS), or index structures (B-trees). ([adamdjellouli.com][1])
That creates an expectation gap.

Fix: either (A) adjust the intro to match what’s actually covered (arrays + hashing + probabilistic membership + string matching), or (B) add short sections for BST search and BFS/DFS so the intro is true.

2) It’s a “list of algorithms” instead of a “decision guide”

Right now the reader has to already know when to use jump vs exponential vs interpolation vs hashing vs BM/KMP, etc.

Cleaner approach: open with a “Which search should I use?” decision tree, then present the algorithms as drill-downs.

Example decision tree outline:

  • Unsorted data + need exact match? → Linear / Sentinel linear ([adamdjellouli.com][1])

  • Sorted array + random access? → Binary (default), then:

    • Target likely near start / unknown bounds? → Exponential ([adamdjellouli.com][1])
    • Values ~uniform numeric? → Interpolation (with caveats) ([adamdjellouli.com][1])
    • Want blocky scanning behavior? → Jump (but explain when it beats binary in practice) ([adamdjellouli.com][1])
  • Key → value lookup / membership in set? → Hash tables (chaining vs probing vs cuckoo) ([adamdjellouli.com][1])

  • Approximate membership (space-first)? → Bloom / Counting Bloom / Cuckoo filter ([adamdjellouli.com][1])

  • Substring search? → KMP / Boyer–Moore / Rabin–Karp ([adamdjellouli.com][1])

This turns the article into something people can apply, not just read.

3) Consistency: the tone slips once in a way that looks accidental

In the Jump Search section, there’s a paragraph that reads like an assistant/meta rewrite note (“Perfect — that’s a jump search trace. Let me reformat…”) which breaks the “notes-style” voice and looks unintentional. ([adamdjellouli.com][1])

Fix: rewrite that intro line to match your usual tone (e.g., “Here’s a clear trace of jump search…”).


Concrete errors / contradictions to fix

1) Quadratic probing example output contradicts the actual trace

You say for quadratic probing (m=11, stored {22,33,44}, target 33):

  • Output: “found (index 4)” ([adamdjellouli.com][1])
  • But your insertion trace places 33 at index 1 (“Insert 33 … place at 1”), and 44 at index 4. Then your Search(33) trace finds it at index 1. ([adamdjellouli.com][1])

Fix: change the output to index 1, or change the insertion story so 33 ends up at 4.

2) Quadratic probing sequence line has a stray/garbled term

You list squares mod 11 as:
“+1, +4, +9, +5, +3, +3²…” ([adamdjellouli.com][1])
That “+3²” is confusing (it looks like you restarted the notation mid-list).

Fix: either show the clean sequence of offsets (1,4,9,5,3,3,5,9,4,1,…) or don’t expand it at all—just show the rule.

3) KMP explanation has a spacing/glitch in the sample text

You write the text as "ababcabca babab d" (with spaces inserted) while earlier you define it as "ababcabcabababd". ([adamdjellouli.com][1])

Fix: remove the stray spaces so readers don’t wonder if it’s a different string.


Smaller clarity / style fixes that will make it “cleaner and more fluid”

  • Tighten repeated boilerplate. Many sections repeat “Example inputs and outputs” → “How it works” → bullets. Consider a uniform mini-template:
    When to use → Assumptions → Core idea → 1 worked example → Complexity → Pitfalls.
  • Be explicit about indexing (0-based) once, then enforce it everywhere. You do call out 0-based early—good—just keep all later examples aligned to that convention. ([adamdjellouli.com][1])
  • Jump Search justification: you say it’s useful when “full binary search isn’t desirable” (branch prediction, cache, etc.). That’s plausible, but it reads a bit hand-wavy; adding a single line like “binary search has poor locality; jump search scans a block sequentially” would ground it. ([adamdjellouli.com][1])
  • Cuckoo filter formula precedence: i2 = i1 XOR H(f) mod b is easy to misread (is mod applied to H(f) or to the XOR result?). Add parentheses for clarity. ([adamdjellouli.com][1])
  • Grammar nit: “Let say we have following array…” → “Let’s say we have the following array…” ([adamdjellouli.com][1])

Suggested “v2” outline (same content, better structure)

  1. Quick chooser (decision tree + table)

  2. Exact search in arrays

  3. Key-based lookup

  4. Approximate membership

    • Bloom → counting Bloom → cuckoo filter (compare tradeoffs) ([adamdjellouli.com][1])
  5. String matching

    • naive → KMP → Boyer–Moore → Rabin–Karp (when each wins) ([adamdjellouli.com][1])

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.