How does the make "-j" option actually work?
From the man pages:
-j [jobs], --jobs[=jobs] Specifies the number of jobs (commands) to run simultaneously. If there is more than one -j option, the last one is effective. If the -j option is given without an argument, make will not limit the number of jobs that can run simultaneously.
I know it uses a graph dependency to know which rules are independent.
I would like to know how this graph is built and understand what are the criteria used.
The dependency graph is based, as one would expect, on the prerequisites listed for each Makefile target. make will build a graph where the targets and prerequisites are vertices and there is a directed edge from the prereqs to their targets. In this way the number of incoming edges tells you how many prereqs a target has. If it has no incoming edges, then it has no prerequisites.
The vertices for the .c and .h files, for example, will have no incoming edges. Those files are your source files and do not need to be built.
It then performs a topological sort on the graph to determine the order of execution. From Wikipedia:
The canonical application of topological sorting (topological order) is in scheduling a sequence of jobs or tasks; topological sorting algorithms were first studied in the early 1960s in the context of the PERT technique for scheduling in project management (Jarnagin 1960). The jobs are represented by vertices, and there is an edge from x to y if job x must be completed before job y can be started (for example, when washing clothes, the washing machine must finish before we put the clothes to dry). Then, a topological sort gives an order in which to perform the jobs.
The gist of a topological sort is to find the vertices with no incoming edges (no dependencies) and put those first. Then remove them from the graph. Now you'll have a new set of vertices with no incoming edges (no dependencies). Those are next. And so on until finished. (If you ever reach a point when there are no such vertices then the dependency graph contains a cycle, which is an error condition.)
In a typical Makefile this means you'll first build the source files (nothing needs to be done). Then the object files that depend on those source files. Then the libraries and executables built from those object files.
Under normal non-parallel operation make will simply pick a single target each iteration and build it. When it is parallel it will grab as many dependency-less targets as it can and build them in parallel, up to the number of permitted simultaneous jobs.
So when make gets to, say, the object file step, it will have a large number of vertices in the graph that all have no incoming edges. It knows it can build the object files in parallel and so it forks off n copies of gcc to build the object files.
I suspect you're expecting something more magical than there actually is here. Makefiles contain lines like:
target: prereq1 prereq2 prereq3 ...
This defines a relationship between files in the system; in graph parlance, each white-space-separated word on the line implicitly declares a node in the graph, and a directed edge is created between each of the nodes to the left of the colon and each of the nodes to the right of the colon, pointing from the latter to the former.
From there it's a simple matter of traversing the graph to find nodes that have no incoming edges and executing the commands associated with those nodes, then working back 'up' the graph from there.
Hope that helps.