Built-in data structures

Reading time3 min

In brief

Article summary

When one wants to store more than one value in a variable, a data structure is needed. High-level programming languages, like Python, propose (among others) four built-in data structures: lists, sets, tuples and dictionaries.

This article briefly describes the specificities of each of these data types.

Main takeaways

  • A list is used to store multiple indexed values and may evolve.

  • A set stores multiple unique and unchangeable values in an unordered manner and may evolve.

  • A tuple stores multiple unchangeable indexed values.

  • A dictionary is structure to store key/value pairs.

Article contents

1 — What is a list?

The list type is undoubtedly the most commonly used data structure for storing multiple values in a single variable. Its main properties are:

  • Values in a list are ordered (it is a sequence), which means that each value is associated with an index from 0 to size of the list minus one.
  • A list allows for duplicate values.
  • A list is a mutable data type, thus allowing for the modification of its values.
  • A list is dynamic, meaning that values can be added or deleted.
  • A list allows for slice selection.
# Define a list
my_structure = [1, 2, 3, 4, 5]
print(f"Initial list: {my_structure} with id {id(my_structure)}")

# Illustration of dynamic property
my_structure.append(6)

# Illustration of mutable property
my_structure[0] = 2
print(f"Modified list: {my_structure} still with id {id(my_structure)}")

# Illustration of slice selection
print(f"Two last elements: {my_structure[-2:]}")
import java.util.ArrayList;
import java.util.List;

public class ExampleList {
    public static void main(String[] args) {
        List<Integer> T = new ArrayList<>();
        T.add(1);
        T.add(2);
        T.add(3);
        T.add(4);
        T.add(5);

        System.out.println("Initial list: " + T + " with ref " + System.identityHashCode(T));

        T.add(6); // Appending an integer instead of a string
        T.set(0, 2); // Modifying the first element

        System.out.println("Modified list: " + T + " still with ref " + System.identityHashCode(T));
        System.out.println("Two last elements: " + T.subList(T.size() - 2, T.size()));
    }
}
Output
Initial list: [1, 2, 3, 4, 5] with id 138495927307072
Modified list: [2, 2, 3, 4, 5, 6] still with id 138495927307072
Two last elements: [5, 6]
TODO

2 — What is a set?

A set has the same properties as those of the mathematical set theory. Its main properties are:

  • Values in a set are unordered, subsequently a value cannot be changed.
  • A set does not allow for duplicate values.
  • A set is a mutable data type, thus allowing for the addition or deletion of values.
# Define a set
my_structure = {1, 3, 5, 5}
print(f"Initial set: {my_structure} with id {id(my_structure)}")

# Add an element
my_structure.add(6)
print(f"Modified set: {my_structure} with id {id(my_structure)}")

# Test if an element belongs to the set
print(f"Contains 4: {4 in my_structure}")
import java.util.HashSet;
import java.util.Set;

public class ExampleSet {
    public static void main(String[] args) {
        Set<Integer> T = new HashSet<>();
        T.add(1);
        T.add(3);
        T.add(5);
        T.add(5); // Duplicate element, will not be added

        System.out.println("Initial set: " + T + " with hash code " + System.identityHashCode(T));
        T.add(6);
        System.out.println("Modified set: " + T + " with hash code " + System.identityHashCode(T));
        System.out.println("Contains 4: " + T.contains(4));
    }
}
Output
Initial set: {1, 3, 5} with id 127760255235264
Modified set: {1, 3, 5, 6} with id 127760255235264
Contains 4: False
TODO

3 — What is a tuple?

A tuple is used to store multiple values that do not evolve. Its main properties are:

  • Values in a tuple are ordered (it is a sequence).
  • A tuple allows for duplicate values.
  • A tuple is an imutable data type, thus its content cannot evolve.
# Initialize a tuple
my_structure = (1, "3", 3.14)
print(f"Initial tuple: {my_structure} with id {id(my_structure)}")
print(f"Second element: {my_structure[1]}")

# Extend the tuple into a new variable
my_structure += ("4",)
print(f"New tuple: {my_structure} with id {id(my_structure)}")
import java.util.Arrays;

public class ExampleTuple {
    public static void main(String[] args) {
        Object[] T = {1, "3", 3.14};
        System.out.println("Initial tuple: " + Arrays.toString(T) + " with hash code " + System.identityHashCode(T));
        System.out.println("Second element: " + T[1]);
        
        Object[] newT = Arrays.copyOf(T, T.length + 1);
        newT[newT.length - 1] = "4";
        System.out.println("New tuple: " + Arrays.toString(newT) + " with hash code " + System.identityHashCode(newT));
    }
}
Output
Initial tuple: (1, '3', 3.14) with id 132604205639104
Second element: 3
New tuple: (1, '3', 3.14, '4') with id 132604205672832
TODO

4 — What is a dictionary?

A dictionary, also called associative array, hash or map, indexes values using keys in order to give a kind of semantics to each stored value. It can be seen as a surjective function $dict : key \rightarrow value$. Its main properties are:

  • A dictionnary is used to store key/value pairs.
  • A dictionary does not allow for key duplicates.
  • A dictionary is a mutable data type, thus allowing for the modification of its values.
  • A dictionary is dynamic, meaning that key/value pairs can be added or deleted at any time.

Here is an example of a program that indexes the words of a text in a dictionary:

# Define the text to index
text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed lacinia, nibh a tincidunt luctus, tortor nunc molestie ante, ut egestas est sapien quis orci. Nunc laoreet eget ipsum id rhoncus. Curabitur vel nibh bibendum, eleifend nibh gravida, tempus massa. Donec id felis blandit, pretium nisl ut, sagittis ligula. Curabitur vitae accumsan ex. Pellentesque interdum condimentum sem, eu aliquet arcu mollis ac. Mauris pellentesque lobortis mi, eget aliquam mauris gravida vel. Nulla quis dui vel arcu malesuada vulputate. Praesent nisi ex, tincidunt fermentum augue eu, cursus vehicula ipsum. Nam vel diam gravida, ultrices eros sed, cursus libero. Etiam maximus tincidunt metus in maximus."

# Process the text
words = text.split()
indexes = {}
for i, word in enumerate(words):
    if word not in indexes:
        indexes[word] = []
    indexes[word].append(i)
print(indexes)
// Needed imports
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * To run this code, you need to have Java installed on your computer, then:
 * - Create a file named `Main.java` in a directory of your choice.
 * - Copy this code in the file.
 * - Open a terminal in the directory where the file is located.
 * - Run the command `javac Main.java` to compile the code.
 * - Run the command `java -ea Main` to execute the compiled code.
 * Note: '-ea' is an option to enable assertions in Java.
 */
public class Main {

    /**
     * This function was translated from the Python code.
     * This is also the case for the contents of the "main" function.
     *
     * @param text The text to index.
     * @return     The dictionary containing the indexes of all the words in the text.
     */
    public static Map<String, List<Integer>> indexWords(String text) {
        // Split the text into words
        String[] words = text.split("\\s+");
        // Create a map to store the indexes
        Map<String, List<Integer>> indexes = new HashMap<>();

        // Loop over the words and populate the index map
        for (int i = 0; i < words.length; i++) {
            String word = words[i];
            // If the word is not yet in the map, add it
            if (!indexes.containsKey(word)) {
                indexes.put(word, new ArrayList<>());
            }
            // Append the current index to the list
            indexes.get(word).add(i);
        }

        return indexes;
    }

    /**
     * This is the entry point of your program.
     * It contains the first codes that are going to be executed.
     *
     * @param args Command line arguments received.
     */
    public static void main(String[] args) {
        // Let's test the method
        String text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed lacinia, nibh a tincidunt luctus, tortor nunc molestie ante, ut egestas est sapien quis orci. Nunc laoreet eget ipsum id rhoncus. Curabitur vel nibh bibendum, eleifend nibh gravida, tempus massa. Donec id felis blandit, pretium nisl ut, sagittis ligula. Curabitur vitae accumsan ex. Pellentesque interdum condimentum sem, eu aliquet arcu mollis ac. Mauris pellentesque lobortis mi, eget aliquam mauris gravida vel. Nulla quis dui vel arcu malesuada vulputate. Praesent nisi ex, tincidunt fermentum augue eu, cursus vehicula ipsum. Nam vel diam gravida, ultrices eros sed, cursus libero. Etiam maximus tincidunt metus in maximus.";
        Map<String, List<Integer>> index = indexWords(text);
        System.out.println(index);
    }

}
Output
{'Lorem': [0], 'ipsum': [1, 27], 'dolor': [2], 'sit': [3], 'amet,': [4], 'consectetur': [5], 'adipiscing': [6], 'elit.': [7], 'Sed': [8], 'lacinia,': [9], 'nibh': [10, 32, 35], 'a': [11], 'tincidunt': [12, 80, 98], 'luctus,': [13], 'tortor': [14], 'nunc': [15], 'molestie': [16], 'ante,': [17], 'ut': [18], 'egestas': [19], 'est': [20], 'sapien': [21], 'quis': [22, 71], 'orci.': [23], 'Nunc': [24], 'laoreet': [25], 'eget': [26, 65], 'id': [28, 40], 'rhoncus.': [29], 'Curabitur': [30, 48], 'vel': [31, 73, 88], 'bibendum,': [33], 'eleifend': [34], 'gravida,': [36, 90], 'tempus': [37], 'massa.': [38], 'Donec': [39], 'felis': [41], 'blandit,': [42], 'pretium': [43], 'nisl': [44], 'ut,': [45], 'sagittis': [46], 'ligula.': [47], 'vitae': [49], 'accumsan': [50], 'ex.': [51], 'Pellentesque': [52], 'interdum': [53], 'condimentum': [54], 'sem,': [55], 'eu': [56], 'aliquet': [57], 'arcu': [58, 74], 'mollis': [59], 'ac.': [60], 'Mauris': [61], 'pellentesque': [62], 'lobortis': [63], 'mi,': [64], 'aliquam': [66], 'mauris': [67], 'gravida': [68], 'vel.': [69], 'Nulla': [70], 'dui': [72], 'malesuada': [75], 'vulputate.': [76], 'Praesent': [77], 'nisi': [78], 'ex,': [79], 'fermentum': [81], 'augue': [82], 'eu,': [83], 'cursus': [84, 94], 'vehicula': [85], 'ipsum.': [86], 'Nam': [87], 'diam': [89], 'ultrices': [91], 'eros': [92], 'sed,': [93], 'libero.': [95], 'Etiam': [96], 'maximus': [97], 'metus': [99], 'in': [100], 'maximus.': [101]}
{mollis=[59], ante,=[17], sem,=[55], vel.=[69], adipiscing=[6], est=[20], cursus=[84, 94], eleifend=[34], accumsan=[50], Pellentesque=[52], egestas=[19], augue=[82], nunc=[15], mi,=[64], orci.=[23], mauris=[67], maximus.=[101], id=[28, 40], Nulla=[70], molestie=[16], vel=[31, 73, 88], elit.=[7], ut=[18], lobortis=[63], Etiam=[96], luctus,=[13], in=[100], lacinia,=[9], fermentum=[81], vulputate.=[76], sed,=[93], eu=[56], Sed=[8], pellentesque=[62], vitae=[49], ipsum=[1, 27], amet,=[4], felis=[41], Curabitur=[30, 48], laoreet=[25], tempus=[37], Mauris=[61], ac.=[60], arcu=[58, 74], dolor=[2], bibendum,=[33], metus=[99], rhoncus.=[29], ex,=[79], vehicula=[85], ex.=[51], diam=[89], sit=[3], interdum=[53], a=[11], nisi=[78], tortor=[14], blandit,=[42], tincidunt=[12, 80, 98], nisl=[44], ipsum.=[86], ultrices=[91], condimentum=[54], aliquam=[66], massa.=[38], ut,=[45], Praesent=[77], nibh=[10, 32, 35], gravida,=[36, 90], Donec=[39], libero.=[95], Lorem=[0], aliquet=[57], malesuada=[75], pretium=[43], sapien=[21], Nunc=[24], dui=[72], quis=[22, 71], Nam=[87], eu,=[83], consectetur=[5], eget=[26, 65], sagittis=[46], ligula.=[47], gravida=[68], eros=[92], maximus=[97]}

To go further

Looks like this section is empty!

Anything you would have liked to see here? Let us know on the Discord server! Maybe we can add it quickly. Otherwise, it will help us improve the course for next year!

To go beyond