UE Source Code Analysis Series
This article is based on the UE 5.0.3 source code.
UE GC
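UE manages non-UObject objects with smart pointers and manages UObject objects with GC.
UObject Management
UObject is the cornerstone of UE. Its internal inheritance hierarchy is as follows: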
```cpp
// Declarations only; class bodies are omitted.
class COREUOBJECT_API UObjectBase
class COREUOBJECT_API UObjectBaseUtility : public UObjectBase
class COREUOBJECT_API UObject : public UObjectBaseUtility
```
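Of these, UObjectBase provides the low-level implementation and UObjectBaseUtility provides utility functions. Neither is recommended for direct use in game code.
COREUOBJECT_API is a macro defined as DLLEXPORT when the CoreUObject module itself is compiled (modules that consume it see the import counterpart). DLLEXPORT exports functions to a DLL: it is defined as __declspec(dllexport) under MSVC and as __attribute__((visibility("default"))) under GCC and Clang.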
UObjectArray.h declares GUObjectArray and GUObjectClusters, which are used for global UObject management:
```cpp
extern COREUOBJECT_API FUObjectArray GUObjectArray;
extern COREUOBJECT_API FUObjectClusterContainer GUObjectClusters;
```
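FUObjectArray internally holds an FChunkedFixedUObjectArray object. The latter holds a pointer-to-pointer to FUObjectItem and manages its storage as chunks of fixed size (\(2^{16}\) elements each). Each FUObjectItem in turn holds a UObjectBase pointer: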
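```cpp
struct FUObjectItem
{
    class UObjectBase* Object;   // Pointer to the allocated object
    int32 Flags;                 // Internal flags
    int32 ClusterRootIndex;      // Index of the owning cluster
    int32 SerialNumber;          // Serial number used by weak object pointers
};
```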
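The chunked layout described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the engine's FChunkedFixedUObjectArray: FItem, FChunkedArraySketch, At, and NumAllocatedChunks are hypothetical names, and only the chunk size of 2^16 elements comes from the text.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for FUObjectItem (illustration only).
struct FItem
{
    void* Object = nullptr;
    int32_t Flags = 0;
};

// Simplified model of a chunked fixed array: a table of chunk pointers,
// where every chunk holds 2^16 items and is allocated on first use.
class FChunkedArraySketch
{
public:
    static constexpr int32_t NumElementsPerChunk = 1 << 16;

    explicit FChunkedArraySketch(int32_t MaxElements)
        : Chunks((MaxElements + NumElementsPerChunk - 1) / NumElementsPerChunk)
    {
    }

    // Returns the item at a global index, allocating its chunk if needed.
    FItem& At(int32_t Index)
    {
        const int32_t ChunkIndex = Index / NumElementsPerChunk;
        const int32_t WithinChunk = Index % NumElementsPerChunk;
        assert(Index >= 0 && ChunkIndex < static_cast<int32_t>(Chunks.size()));
        if (!Chunks[ChunkIndex])
        {
            Chunks[ChunkIndex] = std::make_unique<FItem[]>(NumElementsPerChunk);
        }
        return Chunks[ChunkIndex][WithinChunk];
    }

    // Number of chunks that have actually been allocated so far.
    int32_t NumAllocatedChunks() const
    {
        int32_t Count = 0;
        for (const auto& Chunk : Chunks)
        {
            Count += Chunk ? 1 : 0;
        }
        return Count;
    }

private:
    std::vector<std::unique_ptr<FItem[]>> Chunks;
};
```

In this sketch a chunk never moves once allocated, so a reference to an item stays valid while further chunks are added. That is one plausible motivation for a chunked layout over a single growing array.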
The public object flags, EObjectFlags, are defined as follows:

```cpp
enum EObjectFlags
{
    RF_NoFlags                      = 0x00000000,

    // Flags describing what kind of object this is
    RF_Public                       = 0x00000001,
    RF_Standalone                   = 0x00000002,
    RF_MarkAsNative                 = 0x00000004,
    RF_Transactional                = 0x00000008,
    RF_ClassDefaultObject           = 0x00000010,
    RF_ArchetypeObject              = 0x00000020,
    RF_Transient                    = 0x00000040,

    // Flags the garbage collector looks at
    RF_MarkAsRootSet                = 0x00000080,
    RF_TagGarbageTemp               = 0x00000100,

    // Flags tracking the stages of an object's lifetime
    RF_NeedInitialization           = 0x00000200,
    RF_NeedLoad                     = 0x00000400,
    RF_KeepForCooker                = 0x00000800,
    RF_NeedPostLoad                 = 0x00001000,
    RF_NeedPostLoadSubobjects       = 0x00002000,
    RF_NewerVersionExists           = 0x00004000,
    RF_BeginDestroyed               = 0x00008000,
    RF_FinishDestroyed              = 0x00010000,

    // Miscellaneous flags
    RF_BeingRegenerated             = 0x00020000,
    RF_DefaultSubObject             = 0x00040000,
    RF_WasLoaded                    = 0x00080000,
    RF_TextExportTransient          = 0x00100000,
    RF_LoadCompleted                = 0x00200000,
    RF_InheritableComponentTemplate = 0x00400000,
    RF_DuplicateTransient           = 0x00800000,
    RF_StrongRefOnFrame             = 0x01000000,
    RF_NonPIEDuplicateTransient     = 0x02000000,
    RF_Dynamic UE_DEPRECATED(5.0, "RF_Dynamic should no longer be used. It is no longer being set by engine code.") = 0x04000000,
    RF_WillBeLoaded                 = 0x08000000,
    RF_HasExternalPackage           = 0x10000000,
    RF_PendingKill UE_DEPRECATED(5.0, "RF_PendingKill should not be used directly. Make sure references to objects are released using one of the existing engine callbacks or use weak object pointers.") = 0x20000000,
    RF_Garbage UE_DEPRECATED(5.0, "RF_Garbage should not be used directly. Use MarkAsGarbage and ClearGarbage instead.") = 0x40000000,
    RF_AllocatedInSharedPage        = 0x80000000,
};
```
The internal flags kept in FUObjectItem::Flags, EInternalObjectFlags, are defined as follows:

```cpp
enum class EInternalObjectFlags : int32
{
    None = 0,

    LoaderImport        = 1 << 20,
    Garbage             = 1 << 21,
    PersistentGarbage   = 1 << 22,
    ReachableInCluster  = 1 << 23,
    ClusterRoot         = 1 << 24,
    Native              = 1 << 25,
    Async               = 1 << 26,
    AsyncLoading        = 1 << 27,
    Unreachable         = 1 << 28,
    PendingKill UE_DEPRECATED(5.0, "PendingKill flag should no longer be used. Use Garbage flag instead.") = 1 << 29,
    RootSet             = 1 << 30,
    PendingConstruction = 1 << 31,

    // Objects carrying any of these flags are kept by GC
    GarbageCollectionKeepFlags = Native | Async | AsyncLoading | LoaderImport,
    MirroredFlags = Garbage | PendingKill,
    AllFlags = LoaderImport | Garbage | PersistentGarbage | ReachableInCluster | ClusterRoot | Native | Async | AsyncLoading | Unreachable | PendingKill | RootSet | PendingConstruction
};
```
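To make the bit-mask usage concrete, here is a minimal sketch of how such flags can be combined and queried. It copies a handful of values from EInternalObjectFlags into a hypothetical EInternalFlagsSketch enum. The helpers HasAnyFlags and HasAllFlags are written for this example and are not presented as the engine's own functions.

```cpp
#include <cstdint>

// Reduced copy of a few EInternalObjectFlags values, for illustration only.
enum class EInternalFlagsSketch : int32_t
{
    None         = 0,
    LoaderImport = 1 << 20,
    Garbage      = 1 << 21,
    Native       = 1 << 25,
    Async        = 1 << 26,
    AsyncLoading = 1 << 27,
    Unreachable  = 1 << 28,
    // Composite mask built from single-bit flags, like GarbageCollectionKeepFlags.
    KeepFlags    = Native | Async | AsyncLoading | LoaderImport,
};

// Scoped enums have no built-in bitwise operators, so the sketch defines one.
constexpr EInternalFlagsSketch operator|(EInternalFlagsSketch A, EInternalFlagsSketch B)
{
    return static_cast<EInternalFlagsSketch>(static_cast<int32_t>(A) | static_cast<int32_t>(B));
}

// True if Flags shares at least one bit with Mask.
constexpr bool HasAnyFlags(EInternalFlagsSketch Flags, EInternalFlagsSketch Mask)
{
    return (static_cast<int32_t>(Flags) & static_cast<int32_t>(Mask)) != 0;
}

// True if every bit of Mask is set in Flags.
constexpr bool HasAllFlags(EInternalFlagsSketch Flags, EInternalFlagsSketch Mask)
{
    return (static_cast<int32_t>(Flags) & static_cast<int32_t>(Mask)) == static_cast<int32_t>(Mask);
}
```

For example, HasAnyFlags(Flags, EInternalFlagsSketch::KeepFlags) answers "does this object carry any keep flag?" with a single AND, which is why a composite mask is convenient.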
FUObjectClusterContainer internally holds a TArray<FUObjectCluster> object. FUObjectCluster groups UObjects so that GC can handle them together.
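```cpp
struct FUObjectCluster
{
    int32 RootIndex;                      // Index of the cluster's root object
    TArray<int32> Objects;                // Objects that belong to this cluster
    TArray<int32> ReferencedClusters;     // Other clusters referenced by this cluster
    TArray<int32> MutableObjects;         // Objects referenced by the cluster but not part of it
    TArray<int32> ReferencedByClusters;   // Clusters that reference this cluster
    bool bNeedsDissolving;                // Whether the cluster must be dissolved
};
```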
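The usual motivation for clustering is that objects which live and die together can be treated as one unit, so the collector mostly has to decide about the cluster root. The sketch below illustrates that idea only. FClusterSketch and MarkClusterReachable are hypothetical, and ReferencedClusters here stores indices into the cluster list, which is a simplification chosen for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, reduced cluster record (illustration only).
struct FClusterSketch
{
    int32_t RootIndex = -1;                   // object index of the cluster root
    std::vector<int32_t> Objects;             // object indices of the members
    std::vector<int32_t> ReferencedClusters;  // indices into the cluster list
};

// Marks the root, all members, and all referenced clusters as reachable.
void MarkClusterReachable(const std::vector<FClusterSketch>& Clusters,
                          int32_t ClusterIndex,
                          std::vector<bool>& bReachable)
{
    assert(ClusterIndex >= 0 && ClusterIndex < static_cast<int32_t>(Clusters.size()));
    const FClusterSketch& Cluster = Clusters[ClusterIndex];
    if (bReachable[Cluster.RootIndex])
    {
        return;  // already visited; also stops cycles between clusters
    }
    bReachable[Cluster.RootIndex] = true;
    for (int32_t ObjectIndex : Cluster.Objects)
    {
        bReachable[ObjectIndex] = true;  // members follow the root's fate
    }
    for (int32_t Referenced : Cluster.ReferencedClusters)
    {
        MarkClusterReachable(Clusters, Referenced, bReachable);
    }
}
```

Reaching one root keeps the whole group alive without walking the references of each member individually.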
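UObjectBase::AddObject first derives the EInternalObjectFlags from the EObjectFlags, then calls GUObjectArray.AllocateUObjectIndex(this); to add the object to the global UObject array and assign it an index, and finally calls HashObject(this); to add the object to the name hash table.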
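The registration steps above (allocate an index in a global array, then hash by name) can be sketched as follows. FObjectSketch and FObjectRegistrySketch are hypothetical types written for this example. The real engine code also handles flags, index reuse, and thread safety, all of which are left out here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical minimal object: a name plus the index assigned on registration.
struct FObjectSketch
{
    std::string Name;
    int32_t InternalIndex = -1;
};

class FObjectRegistrySketch
{
public:
    // Mirrors the two registration steps: take a slot in the global array,
    // then make the object findable by name.
    void AddObject(FObjectSketch& Object)
    {
        assert(Object.InternalIndex == -1 && "object registered twice");
        Object.InternalIndex = AllocateIndex(&Object);
        NameHash.emplace(Object.Name, &Object);
    }

    // Returns some object with the given name, or nullptr if none exists.
    FObjectSketch* FindByName(const std::string& Name) const
    {
        const auto It = NameHash.find(Name);
        return It != NameHash.end() ? It->second : nullptr;
    }

private:
    // Appends to the array; a real implementation would also recycle free slots.
    int32_t AllocateIndex(FObjectSketch* Object)
    {
        Objects.push_back(Object);
        return static_cast<int32_t>(Objects.size()) - 1;
    }

    std::vector<FObjectSketch*> Objects;
    std::unordered_multimap<std::string, FObjectSketch*> NameHash;
};
```

The array gives GC a way to iterate over every object by index, while the hash table serves lookups by name. The two structures answer different questions, which is why registration touches both.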
GC Invocation
UE runs GC automatically at a fixed interval (61.1 seconds by default). GC can be configured in UE_5.0\Engine\Config\BaseEngine.ini.
```ini
[/Script/Engine.GarbageCollectionSettings]
gc.MaxObjectsNotConsideredByGC=1
gc.SizeOfPermanentObjectPool=0
gc.FlushStreamingOnGC=0
gc.NumRetriesBeforeForcingGC=10
gc.AllowParallelGC=True
gc.TimeBetweenPurgingPendingKillObjects=61.1
gc.MaxObjectsInEditor=25165824
gc.IncrementalBeginDestroyEnabled=True
gc.CreateGCClusters=True
gc.MinGCClusterSize=5
gc.AssetClustreringEnabled=False
gc.ActorClusteringEnabled=False
gc.BlueprintClusteringEnabled=False
gc.UseDisregardForGCOnDedicatedServers=False
gc.MultithreadedDestructionEnabled=True
gc.VerifyGCObjectNames=True
gc.VerifyUObjectsAreNotFGCObjects=False
gc.PendingKillEnabled=True
```
You can also call GEngine->ForceGarbageCollection(true) manually. This sets bFullPurgeTriggered to true, which forces the engine to run GC from ConditionalCollectGarbage() during UWorld::Tick.
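The interplay between the timed trigger and a forced collection can be modeled roughly as below. This is a simplified sketch, not the engine's ConditionalCollectGarbage: FGCSchedulerSketch and all of its members are hypothetical, and the real function checks further conditions that are omitted here.

```cpp
#include <cassert>

// Hypothetical, heavily simplified model of a per-tick GC trigger.
struct FGCSchedulerSketch
{
    explicit FGCSchedulerSketch(float InInterval) : Interval(InInterval) {}

    // Analogue of forcing a collection: the next tick collects regardless
    // of how much time has elapsed.
    void ForceGarbageCollection(bool bFullPurge)
    {
        bForced = true;
        bFullPurgeTriggered = bFullPurgeTriggered || bFullPurge;
    }

    // Called once per tick; returns true when a collection ran this tick.
    bool Tick(float DeltaSeconds)
    {
        assert(DeltaSeconds >= 0.0f);
        TimeSinceLastGC += DeltaSeconds;
        if (!bForced && TimeSinceLastGC < Interval)
        {
            return false;
        }
        ++NumCollections;
        bLastWasFullPurge = bFullPurgeTriggered;
        // Reset the trigger state for the next cycle.
        TimeSinceLastGC = 0.0f;
        bForced = false;
        bFullPurgeTriggered = false;
        return true;
    }

    float Interval;
    float TimeSinceLastGC = 0.0f;
    bool bForced = false;
    bool bFullPurgeTriggered = false;
    bool bLastWasFullPurge = false;
    int NumCollections = 0;
};
```

With an interval of 61.1 seconds, ticks only accumulate time until the threshold is crossed, whereas a forced request makes the very next tick collect.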
GC Process
Locking
The goal is to prevent other threads from performing UObject operations while GC runs. The GC lock is not reentrant.
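```cpp
void CollectGarbage(EObjectFlags KeepFlags, bool bPerformFullPurge)
{
    // No other thread may perform UObject operations while GC is running
    AcquireGCLock();

    // Perform the actual garbage collection
    CollectGarbageInternal(KeepFlags, bPerformFullPurge);

    // Other threads are free to use UObjects again
    ReleaseGCLock();
}

bool TryCollectGarbage(EObjectFlags KeepFlags, bool bPerformFullPurge)
{
    // Body omitted
}
```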
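void GCLock(): atomically increments GCWantsToRunCounter to tell other threads that GC wants to run, spins until AsyncCounter (non-zero while any non-game thread is blocking GC) drops to 0, then increments GCCounter (non-zero while GC is running), issues a memory barrier, and resets GCWantsToRunCounter to 0.
FPlatformMisc::MemoryBarrier() issues a memory barrier. It guarantees that all reads and writes before the barrier complete before it, and that all reads and writes after the barrier start after it. This preserves the ordering of memory operations and avoids problems caused by instruction reordering.
bool TryGCLock(): returns false if AsyncCounter is non-zero, instead of waiting.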
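The counter protocol described above can be sketched with standard C++ primitives. This is a simplified model under stated assumptions, not the engine's implementation: FGCSyncSketch and its LockAsync, UnlockAsync, GCUnlock, and IsGCLocked members are hypothetical, the std::mutex guarding the check-and-increment steps is a choice made for this sketch, and the behaviour of TryGCLock on success (taking the lock) is an assumption.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical, simplified model of the GC lock (illustration only).
class FGCSyncSketch
{
public:
    // Worker side: block GC while this thread works with objects.
    void LockAsync()
    {
        for (;;)
        {
            {
                std::lock_guard<std::mutex> Guard(Critical);
                // Only enter if GC is neither running nor waiting to run.
                if (GCCounter.load() == 0 && GCWantsToRunCounter.load() == 0)
                {
                    ++AsyncCounter;
                    return;
                }
            }
            std::this_thread::yield();
        }
    }

    void UnlockAsync() { --AsyncCounter; }

    // GC side: announce intent, wait for workers to drain, take the lock.
    void GCLock()
    {
        ++GCWantsToRunCounter;
        for (;;)
        {
            {
                std::lock_guard<std::mutex> Guard(Critical);
                if (AsyncCounter.load() == 0)
                {
                    const int Previous = GCCounter.fetch_add(1);
                    assert(Previous == 0 && "GC lock is not reentrant");
                    (void)Previous;
                    break;
                }
            }
            std::this_thread::yield();
        }
        // Full barrier before clearing the "wants to run" announcement.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        GCWantsToRunCounter.store(0);
    }

    // Non-blocking variant: gives up immediately if any worker holds GC off.
    bool TryGCLock()
    {
        std::lock_guard<std::mutex> Guard(Critical);
        if (AsyncCounter.load() != 0)
        {
            return false;
        }
        ++GCCounter;  // assumption: success takes the lock
        return true;
    }

    void GCUnlock() { --GCCounter; }

    bool IsGCLocked() const { return GCCounter.load() != 0; }

private:
    std::mutex Critical;
    std::atomic<int> AsyncCounter{0};
    std::atomic<int> GCCounter{0};
    std::atomic<int> GCWantsToRunCounter{0};
};
```

Raising GCWantsToRunCounter before waiting keeps new workers from entering, so a steady stream of worker threads cannot starve the collector in this sketch.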
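Marking and Reachability Analysis
The goal is to find all unreachable objects.
This section looks at PerformReachabilityAnalysis, which is called from CollectGarbageInternal. It has two main steps: marking and reachability analysis.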
```cpp
void PerformReachabilityAnalysis(EObjectFlags KeepFlags, const EFastReferenceCollectorOptions InOptions)
{
    // Work list of objects whose references still need to be processed
    FGCArrayStruct* ArrayStruct = FGCArrayPool::Get().GetArrayStructFromPool();
    TArray<UObject*>& ObjectsToSerialize = ArrayStruct->ObjectsToSerialize;

    // Reset the object count
    GObjectCountDuringLastMarkPhase.Reset();

    // Make sure the GC referencer object is processed even if it is disregarded for GC
    if (FPlatformProperties::RequiresCookedData() && FGCObject::GGCObjectReferencer && GUObjectArray.IsDisregardForGC(FGCObject::GGCObjectReferencer))
    {
        ObjectsToSerialize.Add(FGCObject::GGCObjectReferencer);
    }

    // Step 1: marking
    const EFastReferenceCollectorOptions OptionsForMarkPhase = InOptions & ~EFastReferenceCollectorOptions::WithPendingKill;
    (this->*MarkObjectsFunctions[GetGCFunctionIndex(OptionsForMarkPhase)])(ObjectsToSerialize, KeepFlags);

    // Step 2: reachability analysis
    (this->*ReachabilityAnalysisFunctions[GetGCFunctionIndex(InOptions)])(ArrayStruct);

    FGCArrayPool::Get().ReturnToPool(ArrayStruct);
}
```
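In the code above, the MarkObjectsFunctions array (length 4) stores the different instantiations of the MarkObjectsAsUnreachable function template, and the ReachabilityAnalysisFunctions array (length 8) stores the different instantiations of the PerformReachabilityAnalysisOnObjectsInternal function template. In both cases the template arguments are non-type arguments selected by EFastReferenceCollectorOptions.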
```cpp
enum class EFastReferenceCollectorOptions : uint32
{
    None                    = 0,
    Parallel                = 1 << 0,
    AutogenerateTokenStream = 1 << 1,
    ProcessNoOpTokens       = 1 << 2,
    WithClusters            = 1 << 3,
    ProcessWeakReferences   = 1 << 4,
    WithPendingKill         = 1 << 5,
};

// Example entries of the two dispatch tables
MarkObjectsFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::None)] = &FRealtimeGC::MarkObjectsAsUnreachable<false, false>;
ReachabilityAnalysisFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::None)] = &FRealtimeGC::PerformReachabilityAnalysisOnObjectsInternal<EFastReferenceCollectorOptions::None | EFastReferenceCollectorOptions::None>;

template <bool bParallel, bool bWithClusters>
void MarkObjectsAsUnreachable(TArray<UObject*>& ObjectsToSerialize, const EObjectFlags KeepFlags)
{
    // Body omitted
}

struct FGCReferenceInfo
{
    union
    {
        struct
        {
            uint32 ReturnCount : 8;
            uint32 Type        : 5;
            uint32 Offset      : 19;
        };
        uint32 Value;
    };
};

template <EFastReferenceCollectorOptions CollectorOptions>
void PerformReachabilityAnalysisOnObjectsInternal(FGCArrayStruct* ArrayStruct)
{
    // Body omitted
}
```
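Since the bodies of both functions are omitted above, the following sketch shows the general shape of the two steps on a toy object graph. It is a plain single-threaded mark phase followed by a work-list traversal, written under stated assumptions: FNodeSketch, MarkObjectsAsUnreachableSketch, and ReachabilityAnalysisSketch are hypothetical, and the engine's token streams, clusters, and parallelism are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object record: root flag, unreachable flag, outgoing references.
struct FNodeSketch
{
    bool bRootSet = false;
    bool bUnreachable = false;
    std::vector<std::size_t> References;  // indices of referenced objects
};

// Step 1 (marking): flag every non-root object as unreachable and return
// the roots as the initial work list.
std::vector<std::size_t> MarkObjectsAsUnreachableSketch(std::vector<FNodeSketch>& Objects)
{
    std::vector<std::size_t> ObjectsToSerialize;
    for (std::size_t Index = 0; Index < Objects.size(); ++Index)
    {
        Objects[Index].bUnreachable = !Objects[Index].bRootSet;
        if (Objects[Index].bRootSet)
        {
            ObjectsToSerialize.push_back(Index);
        }
    }
    return ObjectsToSerialize;
}

// Step 2 (reachability analysis): follow references from the work list and
// clear the unreachable flag on everything that can be reached.
void ReachabilityAnalysisSketch(std::vector<FNodeSketch>& Objects,
                                std::vector<std::size_t> ObjectsToSerialize)
{
    while (!ObjectsToSerialize.empty())
    {
        const std::size_t Current = ObjectsToSerialize.back();
        ObjectsToSerialize.pop_back();
        for (std::size_t Referenced : Objects[Current].References)
        {
            assert(Referenced < Objects.size());
            if (Objects[Referenced].bUnreachable)
            {
                Objects[Referenced].bUnreachable = false;
                ObjectsToSerialize.push_back(Referenced);  // process each object once
            }
        }
    }
}
```

Objects that only reference each other, with no path from a root, keep their unreachable flag after step 2. This is how a reachability-based collector handles cycles.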
Purging
The goal is to clean up the unreachable objects.
CollectGarbageInternal calls IncrementalPurgeGarbage to perform the cleanup. Along the way, UnhashUnreachableObjects calls UObject::ConditionalBeginDestroy(), and IncrementalDestroyGarbage calls UObject::ConditionalFinishDestroy(). Neither of them actually runs the destructor. Destructors are invoked in TickDestroyObjects and TickDestroyGameThreadObjects.
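The ordering described above (begin-destroy for every object, then finish-destroy, and only then the destructors) can be illustrated with a small model. FDestroyableSketch and PurgeSketch are hypothetical names, the log exists only to make the order observable, and the incremental, time-sliced nature of the real purge is not modeled.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical object with a two-step teardown protocol (illustration only).
class FDestroyableSketch
{
public:
    explicit FDestroyableSketch(std::vector<std::string>& InLog) : Log(InLog) {}

    // The destructor only runs in the final phase, after both callbacks.
    ~FDestroyableSketch()
    {
        assert(bBeginDestroyed && bFinishDestroyed);
        Log.push_back("Destructor");
    }

    // Runs the begin-destroy step at most once.
    void ConditionalBeginDestroy()
    {
        if (!bBeginDestroyed)
        {
            bBeginDestroyed = true;
            Log.push_back("BeginDestroy");
        }
    }

    // Runs the finish-destroy step at most once, after begin-destroy.
    void ConditionalFinishDestroy()
    {
        assert(bBeginDestroyed);
        if (!bFinishDestroyed)
        {
            bFinishDestroyed = true;
            Log.push_back("FinishDestroy");
        }
    }

private:
    std::vector<std::string>& Log;
    bool bBeginDestroyed = false;
    bool bFinishDestroyed = false;
};

// Purges a batch of unreachable objects in three separate passes.
void PurgeSketch(std::vector<std::unique_ptr<FDestroyableSketch>>& Unreachable)
{
    for (auto& Object : Unreachable)
    {
        Object->ConditionalBeginDestroy();
    }
    for (auto& Object : Unreachable)
    {
        Object->ConditionalFinishDestroy();
    }
    Unreachable.clear();  // destructors run here
}
```

Separating the passes means every object in the batch has been told that destruction started before any of them is actually freed.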
Conclusion
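Reference-counting GC is real-time: an object is released as soon as its count drops to zero. However, it cannot resolve circular references caused by misuse. Reachability-based GC is not real-time, but it does handle circular references.
C++11 smart pointers (std::shared_ptr) and Microsoft's COM objects use reference counting.
UE uses reachability-based GC of the mark-sweep kind.
The Java virtual machine uses reachability-based GC with generations that are collected at different frequencies. The young generation uses mark-copy and the old generation uses mark-compact.
CPython uses reference counting, supplemented by a cycle-detection algorithm and a generational scheme.
Lua uses reachability-based GC of the incremental mark-sweep kind.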
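The circular-reference problem of reference counting is easy to reproduce with std::shared_ptr. The sketch below builds two nodes that point at each other, drops the external references, and reports how many nodes are still alive. FCountedNode and LiveNodesAfterRelease are hypothetical names. The function breaks the cycle by hand before returning so that the example itself does not leak.

```cpp
#include <cassert>
#include <memory>

// Node that tracks how many instances are alive through an external counter.
struct FCountedNode
{
    explicit FCountedNode(int& InLiveCount) : LiveCount(InLiveCount) { ++LiveCount; }
    ~FCountedNode() { --LiveCount; }

    std::shared_ptr<FCountedNode> Strong;  // owning reference
    std::weak_ptr<FCountedNode> Weak;      // non-owning reference
    int& LiveCount;
};

// Links two nodes to each other, releases the external references, and
// returns the number of nodes still alive at that point.
int LiveNodesAfterRelease(bool bUseWeakBackReference)
{
    int LiveCount = 0;
    std::weak_ptr<FCountedNode> Observer;
    {
        auto A = std::make_shared<FCountedNode>(LiveCount);
        auto B = std::make_shared<FCountedNode>(LiveCount);
        A->Strong = B;
        if (bUseWeakBackReference)
        {
            B->Weak = A;    // back reference does not keep A alive
        }
        else
        {
            B->Strong = A;  // strong cycle: A and B keep each other alive
        }
        assert(LiveCount == 2);
        Observer = A;
    }
    const int Result = LiveCount;
    // Break the cycle by hand so the sketch itself does not leak.
    if (auto A = Observer.lock())
    {
        A->Strong.reset();
    }
    return Result;
}
```

With a strong back reference both nodes survive the release of the external references. With a weak back reference both are destroyed immediately. A reachability-based collector needs no such discipline from the programmer, because neither node would be reachable from a root.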