✒️
SEI 802
  • Introduction
  • About These Notes
  • Syllabus
  • Development Workflow
    • Installfest
      • Mac OSX
      • Linux
      • Git Configuration
      • Sublime Packages
    • Command Line
      • The Terminal
      • Filesystem Navigation
      • File Manipulation
      • Additional Topics
    • Intro to Git
      • Version Control
      • Local Git
      • Remote Git
      • Git Recipes
    • Group Collaboration
      • Git Workflows
      • Project Roles and Tools
    • VS Code Tips & Tricks
  • HTML/CSS
    • HTML
    • CSS Selectors
    • CSS Box Model and Positioning
      • Box Model
      • Display and Positioning
      • Flexbox
      • Grid
      • Flexbox & Grid Games
      • Floats and Clears
      • Additional Topics
    • Advanced CSS
      • Responsive Design
      • Pseudo-Classes/Elements
      • Vendor Prefixes
      • Custom Properties
      • Additional Topics
    • Bootstrap
    • CSS Frameworks
    • Accessibility
  • JavaScript
    • Primitives
    • Arrays
    • Objects
    • Control Flow
      • Boolean Expressions
      • Conditionals
      • Loops
      • Promises
    • Functions
      • Callbacks
      • Timing Functions
      • Iterators
    • DOM and Events
    • DOM Manipulation
    • HTML5 Canvas
    • How To Reduce Redundancy
    • (2019) JavaScript OOP
    • (2016) OOP with Classes
    • (1995) OOP with Prototypes
      • Constructors
      • Prototypes
    • Intro to TDD
    • Scoping
    • Inheritance
      • Prototypal Inheritance
      • Call, Apply, and other Functions
      • ES6 Inheritance
      • Resources
    • Custom Node Modules
    • Additional Topics
      • AJAX, Fetch, and Async/Await
      • AJAX w/JSON and Localstorage
        • AJAX w/JSON
        • Local Storage
      • Async module
      • Data Scraping
  • jQuery
    • Intro
      • DOM Manipulation
      • Reddit Practice
      • Styling
      • Events
    • Plugins
    • AJAX
  • APIs
    • Fetch
    • AJAX w/jQuery
    • AJAX w/Fetch
  • Databases
    • Intro to SQL
    • Advanced SQL
    • MongoDB
      • Intro to NoSQL
      • CRUD in MongoDB
      • Data Modeling
      • Intermediate Mongo
  • Node/Express
    • Node
      • Intro to Node
      • Node Modules
      • Node Package Manager (NPM)
    • Express
      • Intro to Express
        • Routes
        • Views
        • Templates
        • Layouts and Controllers
        • CRUD & REST
          • Get and Post
          • Put and Delete
      • APIs with Express (request)
      • APIs with Express (axios)
    • Sequelize
      • Terminology
      • Setup
      • Using Models
      • Seeding Data
      • Validations and Migrations
      • Resources
      • 1:M Relationships
      • N:M Relationships
    • Express Authentication
      • Research Components
      • Code Components
      • Auth in Theory
        • Sessions
        • Passwords
        • Middleware
        • Hooks
      • Auth in Practice
        • Create the User
        • User Signup
        • Sessions
        • User Login
        • Authorization and Flash messages
    • Testing with Mocha and Chai
    • Mongoose
      • Mongoose Associations
    • JSON Web Tokens
      • Codealong
    • Additional Topics
      • oAuth
      • Geocoding with Mapbox
      • Geocoding and Google Maps
      • Cloudinary
      • Websockets with Socket.io
      • SASS
  • Ruby
    • Intro to Ruby
    • Ruby Exercises
    • Ruby Classes
    • Ruby Testing with Rspec
    • Ruby Inheritance
    • Ruby Data Scraping
  • Ruby on Rails
    • Intro to Rails
    • APIs with Rails
    • Asset Pipeline
    • Rails Auth and 1-M
      • Auth Components
    • Rails N:M
    • ActiveRecord Polymorphism
    • Additional Topics
      • oAuth
      • SASS
      • Rails Mailers
      • Cloudinary
      • Jekyll
  • React (Updated 2019)
    • ES6+/ESNext
      • Const and Let
      • Arrow Functions
      • Object Literals and String Interpolation
      • ES6 Recap
      • ES6 Activity
    • Intro to React
      • Create React App
      • Components and JSX
      • Virtual DOM
      • Props
      • Dino Blog Activity
      • Nested Components
      • Lab: LotR
    • React State
      • Code-Along: Mood Points
      • Code-Along: Edit Dino Blog
      • Lab: Simple Calc
      • Lifting State
    • React Router
      • Browser History/SPAs
      • React Router (lesson and full codealong)
      • Router Lab
    • Fetch and APIs
      • APIs with Fetch and Axios
      • Fetch the Weather
    • React Hooks
    • React LifeCycle
      • Lab: Component LifeCycle
    • React Deployment
    • Additional Topics
      • React Frameworks
        • Material UI Theming
      • Typescript
        • More Types and Syntax
        • Tsconfig and Declaration Files
        • Generics with Linked List
      • Redux
      • TypeScript
      • Context API
      • React Native
  • Meteor
  • Deployment and Config
    • Deploy - Github Pages
    • Deploy - Node/Sequelize
    • Deploy - Node/MongoDB
    • Deploy React
    • Deploy - Rails
      • Foreman (Environment Variables)
    • Deploy - AWS Elastic Beanstalk
    • Deploy - S3 Static Sites
    • Deploy - Django
    • Deploy - Flask
  • Data Structures and Algorithms
    • Recursion
    • Problem Solving - Array Flatten
    • Binary Search
    • Algorithm Complexity
    • Stacks and Queues
    • Bracket Matching
    • Ruby Linked Lists
      • Sample Code
      • Beginner Exercises
      • Advanced Exercises
    • JS Linked Lists
      • Sample Code
      • Beginner Exercises
      • Beginner Solutions
    • Hash Tables
    • Intro to Sorting
    • Insertion Sort
    • Bucket Sort
    • Bubble Sort
    • Merge Sort
    • Quick Sort
    • Heap Sort
    • Sorting Wrapup
    • Hashmaps
    • Trees and Other Topics
  • Python
    • Python Installation
    • Intro to Python
    • Python Lists
    • Python Loops
    • Python Dictionaries
    • Python Sets and Tuples
    • Python Cheatsheet
    • Python Functions
    • Python Classes
    • Python Class Inheritance
    • Intro to Flask
    • Intro to SQLAlchemy
      • Flask and SQLAlchemy
    • Using PyMongo
    • Intro to Django
    • CatCollector CodeAlong
      • URLs, Views, Templates
      • Models, Migrations
      • Model Form CRUD
      • One-to-Many Relations
      • Many-to-Many Relations
      • Django Auth
    • Django Cheatsheet
    • Django Auth
    • Django Polls App Tutorial
    • Django School Tool Tutorial
    • Django 1:M Relationships
    • Custom Admin Views
    • Data Structures and Algorithms
      • Recursion
      • Binary Search
      • Stacks and Queues
      • Linked Lists
      • Binary Trees
      • Bubble Sort
      • TensorFlow & Neural Networks
    • Adjacent Topics
      • Raspberry Pi
      • Scripting
  • Assorted Topics
    • History of Computer Science
    • Regular Expressions
    • Intro to WDI (Course Info)
    • Being Successful in WDI
    • Internet Fundamentals
      • Internet Lab
    • User Stories and Wireframing
      • Wireframing Exercise: Build an Idea
    • Post WDI
      • Learning Resources
      • Deliverables -> Portfolio
      • FAQ
  • Projects
    • Project 1
    • Project 2
    • Project 3
      • Project 3 Pitch Guidelines
    • Project 4
    • Past Projects
      • Project 1
      • Project 2
      • Project 3
      • Project 4
      • Portfolios
    • Post Project 2
    • MEAN Hackathon
      • Part 1: APIs
      • Part 2: Angular
    • Portfolio
  • Web Development Trends
  • Resources
    • APIs and Data
    • Tech Websites
    • PostgreSQL Cheat Sheet
    • Sequelize Cheat Sheet
    • Database Administration
  • Archived Section
    • (Archived) ReactJS
      • Intro to React
        • Todo List Codealong
        • Additional Topics
      • Deploy React
      • React with Gulp and Browserify
        • Setting up Gulp
        • Additional Gulp Tasks
      • React Router
        • OMDB Router
        • OMDB Search
        • Additional Resources
      • React Animations
        • CSS Animations
    • AngularJS
      • Intro to AngularJS
        • Components and SPA
        • Create an Angular App
      • Angular Directives and Filters
      • Angular Animation
      • Angular Bootstrap Directives
        • Bootstrap Modals
      • Angular $http
      • Angular Services
        • Service Recipes
        • ngResource
        • Star Wars Codealong
      • Angular Routing
      • Angular + Express
      • Angular Authentication
        • Additional Topics
      • Angular Components
      • Angular Custom Filters
      • Angular Custom Directives
Powered by GitBook
On this page
  • Algorithm Analysis
  • Characteristics
  • The Algorithm
  • Calculating the Correct Bucket
  • Implement it!
  • Bonus:

Was this helpful?

  1. Data Structures and Algorithms

Bucket Sort

Bucket sort is a distribution sort in which we take the original unsorted elements in an array and distribute them into a set of buckets. Each bucket is meant to hold a range of element values. After we have distributed all the elements from the array into the buckets, we sort each bucket and then concatenate them all into a single array again. At the end of the process, the original array will be sorted.

Algorithm Analysis

In the best and average cases, bucket sort operates at O(n+k) time efficiency where n is the number of elements in the array and k is the number of buckets you create for sorting. Every operation that we do in a bucket sort is actually O(n) which is "linear time" but the final concatenation of the buckets can be done in constant time, or O(k). As a result, our efficiency for this algorithm is the sum of both of those: O(n+k). In the worst case, bucket sort can perform like a quadratic search at O(n2).

This worst case can happen when we don't have a very even distribution of the values to be sorted. If we have a bunch very near each other, they will all be sorted into the same bucket. This will increase the time needed to sort to closer to a quadratic. As a result, bucket sort is not a great option when you have a lot of repeats or an uneven range of values.

Characteristics

  • Distribution Sort - puts values into auxilliary data structures

  • Not in place - requires additional space and returns a new sorted array

  • Stable - preserves original ordering of duplicates

The Algorithm

  • The function takes as parameters the array to be sorted (arr) and the number of buckets to create (k). (Remember that the number of elements in the array/list and the number of buckets determines the efficiency of this algorithm since it is O(n+k).)

  • Create a new array of k empty buckets (array of arrays / list of lists).

  • Iterate over the unsorted array to find the maximum value.

  • Iterate over that array again to scatter each element into the appropriate bucket.

  • Iterate over the buckets array and sort each bucket (usually we use insertion sort).

  • Return all the buckets concatenated in order.

Calculating the Correct Bucket

This can actually be done mathematically if we know both the number of buckets and the maximum value in our unsorted array. Consider a list of integers:

ints = [19, 28, 61, 32, 17, 59, 48, 4, 10, 74, 39, 69]

Our maximum value in here is 74. We don't want too few buckets but we also don't want too many. Having one bucket for each element actually becomes a different kind of sort known as pidgeonhole sort. With 12 values in the list, why don't we choose 6 which will make sorting each bucket extremely easy. Then the formula for calculating the bucket for each unsorted element is:

bucket_index = math.floor((unsorted_value / maximum + 1) * k)

NOTE: As we will see, this works great for integers but will need to be tweaked for other types like floats or strings.

Let's see if we can understand what this is doing. When we divide the unsorted value by the maximum value in our list plus one, we get a fractional value that helps us approximate where it should be placed in our range. Values closer to the max will have a ratio closer to 1 while lower values will have ratios closer to 0. This gives us a kind of measurement of where this value should go in the final sorted list. We take this ratio and multiply it by the number of buckets and then get the floor of the whole thing to find the index of the bucket closest to where that value should end up. Let's see where our values end up:

math.floor((19 / 75) * 6) = 1
math.floor((28 / 75) * 6) = 2
math.floor((61 / 75) * 6) = 4
math.floor((32 / 75) * 6) = 2
math.floor((17 / 75) * 6) = 1
math.floor((59 / 75) * 6) = 4
math.floor((48 / 75) * 6) = 3
math.floor(( 4 / 75) * 6) = 0
math.floor((10 / 75) * 6) = 0
math.floor((74 / 75) * 6) = 5
math.floor((39 / 75) * 6) = 3
math.floor((69 / 75) * 6) = 5
#    0         1         2         3         4         5
[ [4, 10], [19, 17], [28, 32], [48, 39], [61, 59], [74, 69] ]

Not too shabby! Now all we need to do is sort each bucket and then concatenate them all together and return it.

Implement it!

Now that you have an understanding of how this works, take a stab at implementing it for sorting a list of integers.

Bonus:

If you get that working, try changing it to a list of float values between zero and one (e.g. 0.334, 0.167, 0.834, etc.). How would you need to modify the bucket index calculation to make it work for a list of fractional float values?

PreviousInsertion SortNextBubble Sort

Last updated 3 years ago

Was this helpful?