Sort with complex conditions

The task is something below:

Given a list of items, each item with a directory and a create_time, my task is to sort those items by two rules:

  • first rule: a given index_dict, only contains part of the items.
  • second rule: when the first rule is not exist, use create_time.

Here is an example:


# The given items
[
   %{ "location" => "/folder1", "create_time" => "2019-03-01" },
   %{ "location" => "/folder1/folder1-folder1", "create_time" => "2019-03-02" },
   %{ "location" => "/folder1/folder1-folder1/file1", "create_time" => "2019-03-10" },
   %{ "location" => "/folder1/folder1-folder1/file2", "create_time" => "2019-03-01" },
   %{ "location" => "/folder1/folder1-folder1/file3", "create_time" => "2019-03-03" },
   %{ "location" => "/folder1/folder1-folder1/file4", "create_time" => "2019-03-02" },

   %{ "location" => "/folder2", "create_time" => "2019-01-01" },
   %{ "location" => "/folder2/folder2-folder1", "create_time" => "2019-01-20" },
   %{ "location" => "/folder2/folder2-folder1/file1", "create_time" => "2019-01-22" },
   %{ "location" => "/folder2/folder2-folder1/file2", "create_time" => "2019-01-21" },

   %{ "location" => "/folder2/folder2-folder2", "create_time" => "2019-01-10" },
   %{ "location" => "/folder2/folder2-folder2/file1", "create_time" => "2019-01-11" },
   %{ "location" => "/folder2/folder2-folder2/file2", "create_time" => "2019-01-12" },

   %{ "location" => "/folder3", "create_time" => "2019-02-01" },
   %{ "location" => "/folder3/folder3-folder1", "create_time" => "2019-02-10" },
   %{ "location" => "/folder3/folder3-folder1/file1", "create_time" => "2019-02-02" },
   %{ "location" => "/folder3/folder3-folder1/file2", "create_time" => "2019-02-01" },

   %{ "location" => "/folder3/folder3-folder2", "create_time" => "2019-02-01" },
   %{ "location" => "/folder3/folder3-folder2/file1", "create_time" => "2019-02-03" },
   %{ "location" => "/folder3/folder3-folder2/file2", "create_time" => "2019-02-04" },

   %{ "location" => "/folder3/folder3-folder3", "create_time" => "2019-02-03" },
   %{ "location" => "/folder3/folder3-folder3/file1", "create_time" => "2019-02-05" },
   %{ "location" => "/folder3/folder3-folder3/file2", "create_time" => "2019-02-04" },

   %{ "location" => "/folder3/folder3-folder4", "create_time" => "2019-02-02" }
]


# The given index_dict, 
# PAY attention, they have different levels, and the map keys have no orders .

%{
  "/folder1/folder1-folder1" => [
      "/folder1/folder1-folder1/file1",
      "/folder1/folder1-folder1/file2"
   ],
  "/folder2/folder2-folder1" => [
      "/folder2/folder2-folder1/file1",
      "/folder2/folder2-folder1/file2",
   ],
  "/folder2/folder2-folder2" => [
      "/folder2/folder2-folder2/file1",
      "/folder2/folder2-folder2/file2"
   ],
  "/folder3" => [
      "/folder3/folder3-folder1",
      "/folder3/folder3-folder2"
   ],
}

The final expected order of the given items is


['/',
 '/folder2',
 '/folder2/folder2-folder2',
 '/folder2/folder2-folder2/file1',
 '/folder2/folder2-folder2/file2',
 '/folder2/folder2-folder1',
 '/folder2/folder2-folder1/file1',
 '/folder2/folder2-folder1/file2',
 '/folder3',
 '/folder3/folder3-folder1',
 '/folder3/folder3-folder1/file2',
 '/folder3/folder3-folder1/file1',
 '/folder3/folder3-folder2',
 '/folder3/folder3-folder2/file1',
 '/folder3/folder3-folder2/file2',
 '/folder3/folder3-folder4',
 '/folder3/folder3-folder3',
 '/folder3/folder3-folder3/file2',
 '/folder3/folder3-folder3/file1',
 '/folder1',
 '/folder1/folder1-folder1',
 '/folder1/folder1-folder1/file1',
 '/folder1/folder1-folder1/file2',
 '/folder1/folder1-folder1/file4',
 '/folder1/folder1-folder1/file3']

Let me explain

“/folder1”, “/folder2” and “/folder3” are not in the index_dict, so they are sorted by create_time:
“/folder2” > “/folder3” > “/folder1”

then, Let’s loot at the subfolders of folder2
“/folder2/folder2-folder1” and “/folder2/folder2-folder2” are also not in the index_dict (values), so they are sorted by create_time
“/folder2/folder2-folder2” > “/folder2/folder2-folder1”

then, their subdirectories (here are files)
although “/folder2/folder2-folder1/file2” > “/folder2/folder2-folder1/file1” by create_time, in the index_dict, “file1” > “file2”, so the result is ‘/folder2/folder2-folder1/file1’ > ‘/folder2/folder2-folder1/file2’.

Another example, let’s look at folder3, they have sub folders in the index_dict, so sub folders need to be sorted like that
‘/folder3/folder3-folder1’ > ‘/folder3/folder3-folder2’, then ‘/folder3/folder3-folder4’ > ‘/folder3/folder3-folder3’, by their create_time.

My python code is

class Node:
    def __init__(self, path: str):
        self.path = path
        self.children = {}

class MultiTree:
    def __init__(self, root="/"):
        self.root = Node(root)

    def build(self, lst: list):
        for p in lst:
            f = p.split("/")
            pointer = self.root
            for i in range(1, len(f)):
                path = "/".join(f[:i+1])
                if path not in pointer.children:
                    node = Node(path)
                    pointer.children[path] = node
                    pointer = node
                else:
                    pointer = pointer.children[path]

    @staticmethod
    def dfs(root, index_dict):
        stack, res = [], []
        stack.append(root)
        while len(stack):
            curr = stack.pop()
            if curr.path not in res:
                res.append(curr.path)
            children = list(curr.children.values())
            if curr.path in index_dict:
                index_list = index_dict[curr.path]
                for item in curr.children.values():
                    if item.path not in index_list:
                        index_list.append(item.path)
                children = sorted(children, key=lambda x:index_list.index(x.path))
            stack.extend(reversed(children))
        return res

# data is the given items
# index is the given index_dict
sorted_data = sorted(data, key=lambda x:x["create_time"])
sorted_loc = [w["location"] for w in sorted_data]
tree = MultiTree()
tree.build(sorted_loc)
ft = tree.dfs(tree.root, index)

# ft is the final expected order above.

However, i can’t do it by Elixir, i am really frustrated though i am new to it…
Maybe i need some more exercise…

If anyone who met this issue before, i feel grateful if you give me some advice.


The question is from here :
Continuing the discussion from How to implement a multiple tree?

1 Like

That post is dense, can you slim it down to a simple case of code of what something currently ‘is’ and what you expect it to be? :slight_smile:

Complex sorting is generally quite easily done via Enum.sort_by though?

1 Like

I think so, let me update…

Enum.sort_by is not enough…

The edited OP is not a sorting task, it is a sorting and transformation task, should probably update the title. ^.^;

One note of some ambiguity in the post, where does the '/' element come from in your output as it does not exist in # The given items? Also index_dict keeps being referenced, but what is that?

Ops, i can’t update it~

the / element in the output is actually the root (of any other directories), we can simply ignore it.

index_dict is the additional info takes impact to the sorting task. Let’s say, it’s like a sorting rule.

This problem has been solved.

I use Hierarchical approach, that is, deal with each level of directory from root.

Thanks to the forum~

If anyone is interested, feel free to contact me, and i will post my solution.