All of the paging tables, at all levels, have to be 4KiB aligned. If you use large pages (>4KiB), then the physical address stored in the entry that maps one has to be aligned to the page size. So for example:
PD = 4KiB aligned
PDE target = 4KiB aligned (if it points to a PT), or aligned to the large page size (2MiB or 4MiB) if it maps a large page
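To make the alignment rule concrete, here is a minimal sketch of how I'd check it before building an entry. The helper names (is_aligned, make_pde_table, make_pde_large) are made up for illustration and are not from your code. The flag positions are the standard x86 ones (present = bit 0, writable = bit 1, PS = bit 7).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_4K 0x1000ULL    /* 4 KiB page or paging structure */
#define PAGE_2M 0x200000ULL  /* 2 MiB large page (PAE / long mode) */
#define PAGE_4M 0x400000ULL  /* 4 MiB large page (32-bit non-PAE) */

#define PTE_PRESENT  (1ULL << 0)
#define PTE_WRITABLE (1ULL << 1)
#define PTE_LARGE    (1ULL << 7)  /* PS bit: this PDE maps a large page */

/* True if phys is a multiple of size (size must be a power of two). */
static bool is_aligned(uint64_t phys, uint64_t size)
{
    return (phys & (size - 1)) == 0;
}

/* Hypothetical helper: PDE that points to a page table.
 * The page table itself must be 4 KiB aligned. */
static uint64_t make_pde_table(uint64_t pt_phys)
{
    assert(is_aligned(pt_phys, PAGE_4K));
    return pt_phys | PTE_PRESENT | PTE_WRITABLE;
}

/* Hypothetical helper: PDE that maps a large page directly.
 * The mapped physical address must be aligned to the page size. */
static uint64_t make_pde_large(uint64_t page_phys, uint64_t page_size)
{
    assert(is_aligned(page_phys, page_size));
    return page_phys | PTE_PRESENT | PTE_WRITABLE | PTE_LARGE;
}
```

For example, make_pde_table(0x5000) gives 0x5003, while passing 0x5800 would trip the assert because the low 12 bits of the address are where the flags live.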
EDIT2: Am i even allowed to set the PL2P and so on to a virtual address? I mean as long as i don't disable Paging it should work... It's just gonna get critical when i set CR3, right?
As for the entries in all levels of the paging structure, they all have to hold physical memory addresses, no virtual addresses allowed. How would the CPU know how to translate a virtual address at any level without knowing a physical address at some point? Or maybe I misunderstood again =)
But i still want to do it statically because then i can directly map the whole PM4T to itself
Not sure what you mean here. If you do all the paging statically then you will be severely restricted, for example in the number of processes (address spaces, really). However, if you do do it statically and are not going to modify the tables at runtime, then you don't even need to have the page tables themselves mapped into the virtual address space, as you won't be touching them.
Note that statically allocating them means the loaded image grows by your 66MiB. The file on disk doesn't have to grow much: as long as the tables are zero-initialized they end up in a BSS section, which only records its size. However, the loader (whether a normal app loader in an OS or a boot loader in this instance) still needs to allocate that memory. Suppose the system doesn't have a contiguous 66MiB physical memory area? Some boot loaders might also have restrictions on the size, not sure.
Dynamic allocation is much more flexible and it's actually very easy. Split memory management into two parts, a physical memory manager (PMM) and a virtual memory manager (VMM). The PMM only keeps track of 4KiB frames and doesn't care about anything else. The VMM only does virtual memory management: when asked to "add memory" to some process at a given virtual address (page), it requests a 4KiB frame from the PMM and maps that virtual page to the frame. When de-allocating memory it does the reverse, and the PMM adds the 4KiB frame back to its storage (a stack and a bitmap are the most common, I think).
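Here's a rough sketch of that PMM/VMM split, written so it can be compiled and tried in user space. Everything in it is an assumption for illustration: the names (pmm_alloc_frame, vmm_map_page and so on), the stack-based PMM, the fixed pool size, and the single 1024-entry table standing in for real paging structures. A real kernel would also walk the upper levels and flush the TLB (invlpg) after changing an entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_SIZE   4096u
#define MAX_FRAMES   1024u  /* assumed pool size, illustration only */
#define PTE_PRESENT  0x1u
#define PTE_WRITABLE 0x2u

/* ---- PMM: a stack of free 4 KiB frame addresses ---- */
static uint32_t free_stack[MAX_FRAMES];
static size_t   free_top;

static void pmm_free_frame(uint32_t phys)
{
    assert((phys & (FRAME_SIZE - 1)) == 0);  /* frames are 4 KiB aligned */
    assert(free_top < MAX_FRAMES);
    free_stack[free_top++] = phys;
}

/* Returns 0 when no frame is left, so frame 0 is never handed out. */
static uint32_t pmm_alloc_frame(void)
{
    return free_top ? free_stack[--free_top] : 0;
}

/* Hand count frames starting at base (non-zero, aligned) to the PMM. */
static void pmm_init(uint32_t base, size_t count)
{
    free_top = 0;
    for (size_t i = 0; i < count; i++)
        pmm_free_frame(base + (uint32_t)i * FRAME_SIZE);
}

/* ---- VMM: one page table covering 4 MiB of virtual space ---- */
static uint32_t page_table[1024];

/* Map the page containing virt to a fresh frame. Returns 0 on success,
 * -1 if the page is already mapped or the PMM is out of frames. */
static int vmm_map_page(uint32_t virt)
{
    uint32_t idx = (virt >> 12) & 0x3FFu;
    if (page_table[idx] & PTE_PRESENT)
        return -1;
    uint32_t frame = pmm_alloc_frame();
    if (frame == 0)
        return -1;
    page_table[idx] = frame | PTE_PRESENT | PTE_WRITABLE;
    return 0;
}

/* Unmap the page containing virt and give its frame back to the PMM. */
static void vmm_unmap_page(uint32_t virt)
{
    uint32_t idx = (virt >> 12) & 0x3FFu;
    if (!(page_table[idx] & PTE_PRESENT))
        return;
    uint32_t frame = page_table[idx] & ~(FRAME_SIZE - 1);
    page_table[idx] = 0;
    pmm_free_frame(frame);
}
```

The point of the split is that the PMM never needs to know what a frame is used for, and the VMM never needs to know how free frames are tracked, so you can swap the stack for a bitmap later without touching the mapping code.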
If you haven't already, look up recursive mapping. Basically, you map the _current_ paging structures at the end of the _current_ virtual address space, so the last virtual pages always point to the physical memory of the current paging structures. That way you always know where to find them for the current process. It may take a moment to wrap your head around the concept, but it's quite intuitive once you get used to the idea, and it's easy and efficient.
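To show what that buys you, here is a small sketch of the address arithmetic for 32-bit (non-PAE) paging, assuming the last page directory entry (index 1023) points back at the page directory itself. The function names and the RECURSIVE_SLOT constant are mine, just for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: PDE number 1023 points to the page directory itself. */
#define RECURSIVE_SLOT 1023u
static_assert(RECURSIVE_SLOT < 1024u, "a page directory has 1024 entries");

/* Virtual address where the page directory itself shows up. */
static uint32_t recursive_pd_addr(void)
{
    return (RECURSIVE_SLOT << 22) | (RECURSIVE_SLOT << 12);
}

/* Virtual address of the page table that covers virt. */
static uint32_t recursive_pt_addr(uint32_t virt)
{
    return (RECURSIVE_SLOT << 22) | ((virt >> 22) << 12);
}

/* Virtual address of the PDE for virt (4 bytes per entry). */
static uint32_t recursive_pde_addr(uint32_t virt)
{
    return recursive_pd_addr() + (virt >> 22) * 4u;
}

/* Virtual address of the PTE for virt (4 bytes per entry). */
static uint32_t recursive_pte_addr(uint32_t virt)
{
    return recursive_pt_addr(virt) + ((virt >> 12) & 0x3FFu) * 4u;
}
```

With slot 1023 the page directory appears at 0xFFFFF000 and all the page tables appear in the 4MiB window starting at 0xFFC00000. Because the window is built from whatever CR3 currently points at, the same addresses work in every address space.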
I didn't fully go over your code, but remember that when your code is loaded by the boot loader, all the static allocations (globals) will be at physical addresses. Once you enable paging, everything you access is a virtual address, with the obvious exceptions that what you load into CR3 is physical and that all references in the paging structures are physical addresses as far as the CPU is concerned.
Note: in some parts I've been referring to 32-bit paging because it's easier to talk about, but everything works the same way with long mode paging with respect to the things I've mentioned.
If you haven't read these already:
http://wiki.osdev.org/Paging
http://wiki.osdev.org/Page_Tables#Recursive_mapping
Check especially the second one as it talks about recursive mapping.