---

layout: post
title: "JVM Memory Management"
category: Reading Notes

tags: ["读文章", "Java", "JVM"]

{% include JB/setup %}

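JVM Memory Management

JVM Memory Allocation

The JVM's memory is laid out roughly as shown in the figure below:

[Figure: JVM memory management]

The areas shown in yellow are shared by all threads; the areas in white are private to each thread.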

A memory block can be reached in one of two ways: if the user program holds a reference to that block in a root, or if a reference to that block is held in another reachable block.

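Program counter register

This can be thought of as an execution pointer that identifies the next instruction to run. It is what makes loops, branches, thread switching, and similar control flow possible.

Java virtual machine stack (Java stack)

Holds the execution state of methods: local variables, return type, dynamic linking information, and so on. Each method invocation creates a stack frame containing the local variable table, the operand stack, the method exit (return address), and related data. The frame is destroyed when the method returns.

Native method stack

The stack used by native methods. It serves much the same purpose as the Java stack.

Heap

Where almost all objects live. It is a very large memory region that does not have to be contiguous, and it is the main battleground for GC.

Method area

Stores per-class type data, such as a class's superclass and interfaces, along with constants and static variables. Its contents change relatively little, so some people call it the permanent generation. GC does collect here as well, but it usually reclaims little.

The method area contains one especially important region, the runtime constant pool. The Java class file format has a structure called the constant pool, which holds the literals and symbolic references generated by the compiler; this content is also placed in the runtime constant pool.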

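HotSpot JVM memory implementation model

[Figure: JVM memory management]

HotSpot JVM divides the heap into the following three parts:

  • Young generation (Young)

    The young generation is divided into three parts: Eden and two Survivor spaces of exactly the same size. At any moment only one Survivor space is in use; the other is reserved as the copy target during garbage collection. When the young generation fills up, a minor GC moves the surviving objects into the free Survivor space. Depending on JVM policy, objects that are still alive in a Survivor space after several collections are moved to the Tenured space.

  • Old generation (Tenured)

    The Tenured space mainly holds long-lived objects, generally older ones. Once an object has been copied within Young a certain number of times, it is moved to Tenured. If the system uses an application-level cache, the cached objects usually end up in this space.

  • Permanent generation (Perm)

    Perm mainly holds class, method, and field objects. This space does not normally overflow unless a very large number of classes are loaded at once. Application servers that support hot deployment, however, sometimes hit java.lang.OutOfMemoryError: PermGen space. A likely cause is repeated redeployment in which the classes from earlier deployments are never unloaded, so a large number of class objects pile up in Perm. Restarting the application server generally resolves the problem.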

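HotSpot JVM provides the following options for configuring memory:

  • Total memory

    -Xms: the initial heap size that the JVM sets up at startup

    -Xmx: the maximum heap size. At startup the JVM reserves the amount given by -Xmx but does not necessarily use all of it; it starts from the size given by -Xms and adjusts how much memory is actually in use.

  • Young generation

    -Xmn: sets the size of the young generation

    -XX:SurvivorRatio: the ratio of Eden to one Survivor space; the default is 8. For example, -XX:SurvivorRatio=32 means Eden to one Survivor is 32:1, so one Survivor space occupies 1/34 of the young generation.

  • Old generation

    -XX:NewRatio: the ratio of the old generation to the young generation; the default is 2. For example, -XX:NewRatio=8 means tenured:young is 8:1.

  • Permanent generation

    -XX:MaxPermSize: the maximum size of the permanent generation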
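The ratios above are easy to get wrong, so here is a minimal arithmetic sketch. The class and method names (`HeapSizing`, `survivorBytes`, and so on) are hypothetical helpers, not part of any JDK API, and the sketch ignores the alignment and rounding that a real JVM applies to region sizes.

```java
// Hypothetical helper for illustration only; a real JVM also aligns region sizes.
class HeapSizing {
    /** young = Eden + 2 * Survivor and Eden = ratio * Survivor, so Survivor = young / (ratio + 2). */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Eden is whatever remains of the young generation after the two Survivor spaces. */
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    /** heap = old + young and old = newRatio * young, so young = heap / (newRatio + 1). */
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long young = 340 * mb;
        System.out.println("survivor: " + survivorBytes(young, 32) / mb + " MB"); // 10 MB, i.e. 1/34 of young
        System.out.println("eden: " + edenBytes(young, 32) / mb + " MB");         // 320 MB
        System.out.println("young: " + youngBytes(900 * mb, 8) / mb + " MB");     // 100 MB of a 900 MB heap

        // The limits the running JVM actually reports (affected by -Xms/-Xmx).
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap bytes: " + rt.maxMemory());
        System.out.println("committed heap bytes: " + rt.totalMemory());
    }
}
```

Running it with different -Xms and -Xmx values shows how the committed size starts near -Xms while the maximum follows -Xmx.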

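GC

GC method

Java uses root-search (reachability) analysis for GC: starting from the GC roots, the collector traverses everything that can be reached, and whatever is never reached is garbage and can be reclaimed. This algorithm is immune to circular references, meaning that objects that reference each other in a cycle are still collected once nothing reachable points to them. GC roots include the following four kinds:

  • Objects referenced by variables on the stack
  • Objects referenced by static fields in the method area
  • Objects referenced by constants in the method area
  • Objects referenced through JNI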
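To make the root search concrete, here is a toy sketch of the marking step over a small object graph. `Node` and `mark` are hypothetical names for this illustration; a real collector works on raw heap memory rather than Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class Reachability {
    /** Toy heap object: a name plus its outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Returns every node reachable from the roots; anything else is garbage. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {          // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b");
        Node x = new Node("x"), y = new Node("y");
        a.refs.add(b);                    // root -> a -> b
        x.refs.add(y);
        y.refs.add(x);                    // x <-> y form a cycle that no root reaches

        Set<Node> live = mark(Collections.singletonList(a));
        System.out.println(live.contains(b)); // true
        System.out.println(live.contains(x)); // false: the cycle is still garbage
    }
}
```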

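Garbage collection strategies

  • Reference counting

    This approach adds a reference count to every object, recording how many references in the running program currently point to it. When an object's count drops to 0, the object becomes a candidate for the garbage collector.

    Each object has an associated reference count: the number of active references to that object. Each time a pointer reference is modified, such as through an assignment statement, or when a reference goes out of scope, the compiler must generate code to update the referenced object's reference count.

    Pros: simple and direct; does not require pausing the whole application.

    Cons: needs cooperation from the compiler, which must emit special instructions to maintain the counts, for example whenever an object is assigned to a new reference or a reference goes out of scope.

    None of the standard garbage collectors in the JDK uses reference counting; instead, they all use some form of tracing collector.

  • Tracing collectors

    A tracing collector first pauses the whole application, then scans the heap starting from the root objects, determining for each object whether anything still references it. Scanning the entire heap every time would make GC take longer and hurt the application itself, so the JVM uses generational collection: a minor GC of the young generation scans only the young generation, not the old generation.

    With generational collection, a minor GC scans only the young generation, so how does it know whether an old-generation object references a young-generation object? The JVM uses card marking. The old generation is divided into blocks, each called a card, and the JVM keeps the state of every card in a card table. While the Java program runs, whenever an old-generation object gains or drops a reference to a young-generation object, the JVM marks the corresponding card as dirty. A minor GC then scans only the dirty cards rather than the whole heap.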
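The card-marking idea can be sketched as a toy model. Everything here is an assumption made for illustration: the `CardTable` class, the 512-byte card size, and the use of byte offsets into the old generation. It is not HotSpot's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

class CardTable {
    /** Assumed card size of 512 bytes (2^9); chosen for this sketch. */
    static final int CARD_SHIFT = 9;
    private final boolean[] dirty;

    CardTable(long oldGenBytes) {
        long cards = (oldGenBytes + (1L << CARD_SHIFT) - 1) >>> CARD_SHIFT;
        dirty = new boolean[(int) cards];
    }

    /** Write barrier: run whenever a reference field at this old-generation offset changes. */
    void recordWrite(long offset) {
        dirty[(int) (offset >>> CARD_SHIFT)] = true;
    }

    /** What a minor GC would do: visit only the dirty cards, then mark them clean again. */
    List<Integer> drainDirtyCards() {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < dirty.length; i++) {
            if (dirty[i]) {
                result.add(i);
                dirty[i] = false;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        CardTable table = new CardTable(4096);       // 8 cards
        table.recordWrite(1000);                     // card 1 covers offsets 512..1023
        table.recordWrite(1010);                     // same card, still one dirty entry
        table.recordWrite(3000);                     // card 5 covers offsets 2560..3071
        System.out.println(table.drainDirtyCards()); // [1, 5]
    }
}
```

The point of the design is that the barrier is a single array store, so the cost of tracking cross-generation references is paid in tiny increments by the application instead of in one large scan by the collector.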

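[Figure: JVM memory management]

References in Java fall into the following kinds:
  1. Strong reference

A strong reference is Java's default; the references we normally create are all strong. An object with no strong references left becomes eligible for collection.

  2. Soft reference

A softly referenced object is not reclaimed by an ordinary GC; it is reclaimed only when memory runs short. This makes soft references a good fit for caches, because cached objects get to stay in memory as long as possible.

  3. Weak reference

A weak reference lets an object be reclaimed sooner: if an object has only weak references and no strong ones, it is reclaimed at the next GC.

  4. Phantom reference

A phantom reference is a reference in name only: you cannot obtain the object instance through it. Its main purpose is to deliver a notification when the object it refers to is reclaimed.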
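The four kinds map onto classes in `java.lang.ref`. The sketch below is illustrative only: the `ReferenceKinds` class and its helper methods are made-up names, and because `System.gc()` is merely a hint, the values printed after it can differ from run to run.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKinds {
    /** While the target is strongly reachable, a weak reference still returns it. */
    static boolean weakSeesTarget(Object target) {
        WeakReference<Object> weak = new WeakReference<>(target);
        return weak.get() == target;
    }

    /** PhantomReference.get() always returns null, so the object cannot be resurrected. */
    static boolean phantomHidesTarget(Object target) {
        PhantomReference<Object> phantom =
                new PhantomReference<>(target, new ReferenceQueue<Object>());
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        Object strong = new Object();                          // ordinary strong reference
        SoftReference<Object> soft = new SoftReference<>(new Object());
        WeakReference<Object> weak = new WeakReference<>(strong);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);

        System.out.println(weakSeesTarget(strong));            // true
        System.out.println(phantomHidesTarget(strong));        // true

        strong = null;                                         // drop the only strong reference
        System.gc();                                           // a hint; results below may vary
        System.out.println("weak after gc: " + weak.get());
        System.out.println("soft after gc: " + soft.get());
        System.out.println("phantom enqueued: " + (queue.poll() != null));
    }
}
```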

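GC algorithms

There are several GC algorithms, each suited to different circumstances.

Mark-sweep

The most basic form of tracing collector.

In short: mark, then sweep away everything that is garbage. Its drawbacks are low efficiency and memory fragmentation.

The collector scans every live object starting from the roots and marks each one it visits; once marking is complete, it clears the objects that were not marked.

The world is stopped and the collector visits each live node, starting from the roots, and marks each node it visits.

When there are no more references to follow, collection is complete, and then the heap is swept (every object in the heap is examined), and any object not marked is reclaimed as garbage and returned to the free list.

Pros: handles circular references; needs no cooperation from the compiler.

Cons: every allocated object, whether reachable or not, is visited during the sweep phase, so collection pauses are relatively long.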

Because a significant percentage of objects are likely to be garbage, this means that the collector is spending considerable effort examining and handling garbage.

Mark-sweep collectors also tend to have the heap fragmented, which can cause locality issues and can also cause allocation failures even when sufficient free memory appears to be available.

Copying

Another form of tracing collector.

Memory is split into two halves and only one is used (at any moment only one space is active). During GC, the live objects in the active half are copied to the other half. The drawback is that half of the memory is wasted.

When the active space fills up, the world is stopped and live objects are copied from the active space into the inactive space. The roles of the spaces are then flipped, with the old inactive space becoming the new active space.

Pros: only reachable objects are scanned, not all objects, which shortens application pauses.

Only live objects are visited, which means garbage objects will not be examined, nor will they need to be paged into memory or brought into the cache.

The set of live objects is compacted into the bottom of the heap. This not only improves locality of reference of the user program and eliminates heap fragmentation, but also greatly reduces the cost of object allocation: object allocation becomes a simple pointer addition on the top-of-heap pointer.

Cons: extra space is required, because one block of memory is always unused; copying objects also has a cost.

Mark-compact

The copying algorithm has excellent performance characteristics, but it has the drawback of requiring twice as much memory as a mark-sweep collector.

Similar to mark-sweep, but instead of just clearing garbage in place, it moves the live objects toward one end, which avoids fragmentation.

It runs in two phases. The first phase scans and marks all live objects. The second phase clears the unmarked objects and then copies the live objects to the bottom of the heap.

Mark-compact is a two-phase process, where each live object is visited and marked in the marking phase. Then, marked objects are copied such that all the live objects are compacted at the bottom of the heap. If a complete compaction is performed at every collection, the resulting heap is similar to the result of a copying collector: there is a clear demarcation between the active portion of the heap and the free area, so that allocation costs are comparable to a copying collector. Long-lived objects tend to accumulate at the bottom of the heap, so they are not copied repeatedly as they are in a copying collector.
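The compaction phase can be illustrated with a toy heap modelled as an array of slots. This is a sketch under simplifying assumptions: `MarkCompact` and `compact` are hypothetical names, objects are one slot each, and no references need to be updated after the move, which a real collector would have to do.

```java
import java.util.Arrays;

class MarkCompact {
    /**
     * heap[i] holds an object (or null for a free slot); marked[i] says whether the
     * marking phase reached it. Slides marked objects to the bottom of the heap,
     * frees everything else, and returns the new top-of-heap index.
     */
    static int compact(String[] heap, boolean[] marked) {
        int top = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[top++] = heap[i];   // top never passes i, so nothing unvisited is overwritten
            }
        }
        Arrays.fill(heap, top, heap.length, null);
        return top;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", null, "c", "d"};
        boolean[] marked = {true, false, false, true, false};
        int top = compact(heap, marked);
        System.out.println(Arrays.toString(heap) + " top=" + top);
        // prints [a, c, null, null, null] top=2
    }
}
```

After compaction, allocation is just "place the new object at `top` and advance `top`", which is the pointer-bump allocation the text describes.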

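Generational collection

The heap is divided into several regions, each collected with a different algorithm. It is usually split into a young generation and an old generation: the young generation holds objects that have not been alive for long, and the old generation holds the opposite.

The young generation generally uses the copying algorithm, while the old generation uses mark-compact or mark-sweep.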

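HotSpot JVM garbage collection strategies

Running GC costs a certain amount of CPU resources and time.

Serial Collector

A serial collector is one in which only a single thread performs garbage collection at any time.

It has to stop the whole application while it runs. This type of collector suits single-CPU machines.

The serial collectors include the following:

  • Serial Copying Collector:

    This GC is enabled with -XX:+UseSerialGC and is used only for collecting young-generation objects.

    -XX:MaxTenuringThreshold sets how many times an object is copied. When Eden runs out of space, the GC copies the live objects in Eden, together with the objects in the From Survivor space that do not yet qualify for the old generation, into the other space, called To Survivor. This parameter decides which objects in From Survivor do not yet qualify: if it is set to 31, only objects that have been copied 31 times qualify.

The From Survivor and To Survivor roles keep alternating. Only one of the two spaces is in use at a time, and that space is called From Survivor; after each copy the roles swap.

If To Survivor turns out to be full during copying, objects are copied directly to the old generation.

Fairly large objects are also copied directly to the old generation; in development we should try to avoid this.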

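Parallel Collector

Parallel collectors are mainly aimed at multi-CPU environments with large amounts of data.

Parallel collectors can be further divided into the following three kinds:

  • Parallel Copying Collector

    Enabled with -XX:+UseParNewGC. It is mainly used to collect the young generation and can be used together with CMS.

  • Parallel Mark-Compact Collector

    Enabled with -XX:+UseParallelOldGC. It is mainly used to collect old-generation objects.

  • Parallel Scavenging Collector

    Enabled with -XX:+UseParallelGC. It collects young-generation objects but cannot be combined with CMS. It suits cases with a relatively large young generation.

Concurrent Collector

A concurrent collector performs garbage collection concurrently with the application, which shortens the pause that each collection imposes. In HotSpot JVM it is called CMS GC, and it is useful when responsiveness matters more than throughput. It is enabled with -XX:+UseConcMarkSweepGC and is mainly used to collect the old generation and the Perm generation.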

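CMS GC can suffer a concurrent mode failure:

Concurrent mode failure: while CMS GC runs, user threads keep running too. If the GC reclaims memory more slowly than new objects are created, or in other words the old generation cannot satisfy the user threads' allocation requests while a collection is in progress, a concurrent mode failure occurs. When that happens, the JVM triggers a stop-the-world Full GC, which leads to an excessively long pause. CMS GC provides the option -XX:CMSInitiatingOccupancyFraction to start a collection once old-generation occupancy exceeds a given value, so setting this option too high can cause more concurrent mode failures.

Difference between concurrent and parallel collectors:

One point to keep in mind: a concurrent collector is one whose GC threads can run concurrently with application threads, so sweeping does not require a stop-the-world pause. A parallel collector is one in which multiple threads collect garbage in parallel, but it still has to pause the application (the so-called stop-the-world pause).

The figure below illustrates which collectors can be paired with one another.

Garbage collection configuration strategies for different situations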

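Throughput first

Throughput is the share of total running time that is not spent in GC. For example, if the system runs for 100 minutes and GC takes 1 minute, throughput is 99%.

Throughput-first tuning generally suits scenarios without strict responsiveness requirements, such as web applications: network transmission already has latency, so users tend to attribute a brief GC pause to network congestion.

Throughput-first behavior can be requested with -XX:GCTimeRatio. When -XX:GCTimeRatio alone cannot meet the system's requirements, the JVM can be tuned in finer detail.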
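The arithmetic behind throughput and -XX:GCTimeRatio can be sketched as follows. `GcThroughput` and its methods are hypothetical helper names; the loop at the end uses the standard `java.lang.management` beans to show the GC counts and times that the running JVM reports, which vary by collector and workload.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcThroughput {
    /** Throughput: the share of total running time not spent in GC. */
    static double throughput(double totalTime, double gcTime) {
        return (totalTime - gcTime) / totalTime;
    }

    /** -XX:GCTimeRatio=N sets a goal of spending at most 1/(1+N) of total time in GC. */
    static double gcTimeGoal(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println(throughput(100, 1)); // 0.99, the 100-minute example above
        System.out.println(gcTimeGoal(99));     // 0.01, i.e. 1% GC time

        // Collector names and numbers depend on the JVM and the flags in use.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```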

First, high throughput calls for a relatively large young generation, which is where the Parallel Scavenging Collector comes in; it is enabled with -XX:+UseParallelGC.

java -server -Xms3072m -Xmx3072m -XX:NewSize=2560m -XX:MaxNewSize=2560m -XX:SurvivorRatio=2 -XX:+UseParallelGC

Controlling the number of parallel threads

By default the Parallel Scavenging Collector starts as many threads as there are CPUs to collect in parallel, but the number can be tuned. To collect the young generation with 4 parallel threads, set -XX:ParallelGCThreads=4; the JVM options then look like this:

java -server -Xms3072m -Xmx3072m -XX:NewSize=2560m -XX:MaxNewSize=2560m -XX:SurvivorRatio=2 -XX:+UseParallelGC -XX:ParallelGCThreads=4

Automatically sizing the young generation

With the Parallel Scavenge collector, the GC can adjust the survivor ratio automatically at run time to get the best performance, so -XX:+UseAdaptiveSizePolicy should always be enabled with this collector. The JVM options then look like this:

java -server -Xms3072m -Xmx3072m -XX:+UseParallelGC -XX:ParallelGCThreads=4 -XX:+UseAdaptiveSizePolicy

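Response time first

Response time first means that each GC run must not take too long. It is typically used for systems with strict timeliness requirements, such as stock trading systems.

It can be configured with -XX:MaxGCPauseMillis; once set, the JVM automatically adjusts the sizes of the young and old generations to try to meet the target.

In most cases the JVM's default configuration is good enough; tune the JVM for a specific situation only when the defaults cannot meet the system's requirements. In that case, the young generation can use the Parallel Copying Collector and the old generation can use the Concurrent Collector.

For example, the following options use the Parallel Copying Collector for the young generation and the CMS collector for the old generation.

java -server -Xms512m -Xmx512m -XX:NewSize=64m -XX:MaxNewSize=64m -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC

Two points need attention here:

1. If -XX:+UseParNewGC is not specified, the default non-parallel copying collector is used.

2. If -XX:+UseParNewGC is set on a single-CPU system, the default copying collector is still used.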

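Controlling the number of parallel threads

By default the Parallel Copying Collector starts as many threads as there are CPUs. The number can also be set with -XX:ParallelGCThreads; for example, to run the parallel copying collection with 4 threads, change the options above as follows:

java -server -Xms512m -Xmx512m -XX:NewSize=64m -XX:MaxNewSize=64m -XX:SurvivorRatio=2 -XX:ParallelGCThreads=4 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC

Controlling the threshold for concurrent collection

By default, CMS GC starts a collection when old-generation occupancy rises above 68%. The threshold can be controlled with -XX:CMSInitiatingOccupancyFraction, for example by changing the JVM options above as follows:

java -server -Xms512m -Xmx512m -XX:NewSize=64m -XX:MaxNewSize=64m -XX:SurvivorRatio=2 -XX:ParallelGCThreads=4 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=35

One more option worth mentioning is -XX:+PrintCommandLineFlags, which shows the defaults the JVM picks when the memory configuration and garbage collection algorithm are not specified explicitly.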

When it comes to garbage collection we play with 3 major variables that set targets for collectors:

  • Throughput: The amount of work done by an application as a ratio of time spent in GC. Target throughput with -XX:GCTimeRatio=99; 99 is the default, equating to 1% GC time
  • Latency: The time taken by systems in responding to events, which is impacted by pauses introduced by garbage collection. Target latency for GC pauses with -XX:MaxGCPauseMillis=<n>
  • Memory: The amount of memory our systems use to store state, which is often copied and moved around when being managed. The set of active objects retained by the application at any point in time is known as the Live Set.

We should not interpret the term 'real-time' to mean the lowest possible latency; rather real time refers to having deterministic latency regardless of throughput.

Tradeoffs often play out as follows:

  • To a large extent the cost of garbage collection, as an amortized cost, can be reduced by providing the garbage collection algorithms with more memory
  • The observed worst case latency inducing pauses due to garbage collecting can be reduced by containing the live set and keeping the heap size small
  • The frequency with which pauses occur can be reduced by managing the heap and generation sizes, and by controlling the application's object allocation rate
  • The frequency of large pauses can be reduced by concurrently running the GC with the application, sometimes at the expense of throughput

Garbage collection algorithms are often optimised with the expectation that most objects live for a very short period of time, while relatively few live for very long.

Hotspot garbage collectors record the age of an object in terms of the number of GC cycles survived.

Older generations are less sparse, and as a result the efficiency of older generation collection algorithms tends to be much lower. Generational garbage collectors tend to operate in two distinct collection cycles: Minor collections, when short lived objects are collected, and the less frequent Major collections, when the older regions are collected.

Stop-The-World Events

For garbage collectors to operate, it is necessary, for practical engineering reasons, to periodically stop the running application so that memory can be managed. Such a pause is known as a 'Stop-The-World' event.

To bring an application to a total stop it is necessary to pause all the running threads. Garbage collectors do this by signaling the threads to stop when they come to a 'safepoint', which is a point during program execution at which all GC roots are known and all heap object contents are consistent.

Depending on what a thread is doing it may take some time to reach a safepoint. Safepoint checks are normally performed on method returns and loop back edges, but can be optimized away in some places making them more dynamically rare.

Time To Safepoint (TTSP) is an important consideration in low-latency applications. This time can be surfaced by enabling the -XX:+PrintGCApplicationStoppedTime flag in addition to the other GC flags

Eden is the region where most objects are initially allocated. The survivor spaces are a temporary store for objects that have survived a collection of the Eden space. Collectively, Eden and the survivor spaces are known as the 'young' or 'new' generation.

Objects that live long enough are eventually promoted to the tenured space.

The perm generation is where the runtime stores objects it 'knows' to be effectively immortal, such as Classes and static Strings.

Object Allocation

To avoid contention each thread is assigned a Thread Local Allocation Buffer (TLAB) from which it allocates objects. Using TLABs allows object allocation to scale with number of threads by avoiding contention on a single memory resource.

When a TLAB is exhausted a thread simply requests a new one from the Eden space. When Eden has been filled a minor collection commences.

Minor Collections

A minor collection is triggered when Eden becomes full. This is done by copying all the live objects in the new generation to either a survivor space or the tenured space as appropriate. Copying to the tenured space is known as promotion or tenuring. Promotion occurs for objects that are sufficiently old (-XX:MaxTenuringThreshold=<n>), or when the survivor space overflows.

Live objects are objects that are reachable by the application, any other objects cannot be reached and can therefore be considered dead. In a minor collection, the copying of live objects is performed by first following what are known as GC Roots, and iteratively copying anything reachable to the survivor space.

In generational collection, the GC Roots for the new generation's reachable object graph also include any references from the old generation to the new generation. These references must also be processed to make sure all reachable objects in the new generation survive the minor collection. Identifying these cross-generational references is achieved by use of a 'card table'.

There are two survivor spaces in the Hotspot new generation, which alternate in their 'to-space' and 'from-space' roles. At the beginning of a minor collection, the to-space survivor space is always empty, and acts as a target copy area for the minor collection. The previous minor collection's target survivor space is part of the from-space, which also includes Eden, where live objects that need to be copied may be found.

The cost of a minor GC collection is usually dominated by the cost of copying objects to the survivor and tenured spaces. The work done during a minor collection is directly proportional to the number of live objects found, and not to the size of the new generation. The total time spent doing minor collections can be almost halved each time the Eden size is doubled. Memory can therefore be traded for throughput. A doubling of Eden size will result in an increase in collection time per collection cycle, but this is relatively small if both the number of objects being promoted and the size of the old generation are constant.

Major collections

Major collections collect the old generation so that objects can be promoted from the young generation.

The old generation collector will try to predict when it needs to collect to avoid a promotion failure from the young generation. The collectors track a fill threshold for the old generation and begin collection when this threshold is passed. If this threshold is not sufficient to meet promotion requirements then a 'FullGC' is triggered. A FullGC involves promoting all live objects from the young generations followed by a collection and compaction of the old generation.

To avoid promotion failure you will need to tune the padding that the old generation allows to accommodate promotions (-XX:PromotedPadding=<n>)

Serial Collector

It uses a single thread for both minor and major collections. Objects are allocated in the tenured space using a simple bump-the-pointer algorithm. Major collections are triggered when the tenured space is full.

Reference:

http://imtiger.net/blog/2010/02/21/jvm-memory-and-gc-2/

http://imtiger.net/blog/2010/02/21/jvm-memory-and-gc-1/

http://www.ibm.com/developerworks/library/j-jtp10283/index.html?S_TACT=105AGX52&S_CMP=cn-a-j

2015/10/25

---

layout: post
title: "The Java ClassLoader Mechanism"
category: Reading Notes

tags: ["读文章", "Java"]

{% include JB/setup %}

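The Java ClassLoader Mechanism

Java class loaders rest on three principles: delegation, visibility, and uniqueness.

  • Delegation means that a request to load a class is handed to the parent class loader first; only if the parent cannot find or load the class does the loader try to load it itself.
  • Visibility means that a child class loader can see all classes loaded by its parent loaders, whereas a parent loader cannot see classes loaded by its children.
  • Uniqueness means that a class is loaded only once; delegation ensures that a child loader does not reload a class already loaded by a parent.

What is a class loader

A class loader is a class used to load class files. Java source code is compiled by the javac compiler into class files, and the JVM then runs the program by executing the bytecode in those class files. Class loaders are responsible for loading class files from the file system, the network, or other sources.

Three class loaders are used by default: the Bootstrap class loader, the Extension class loader, and the System class loader (also called the Application class loader). Each has a predefined location from which it loads classes.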

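How class loaders work

Delegation

Suppose your application needs a class called Abc.class.

  • First, the request to load the class is delegated by the Application class loader to its parent, the Extension class loader,
  • which in turn delegates it to the Bootstrap class loader. The Bootstrap class loader first checks whether rt.jar contains the class.
  • Because it does not, the request comes back to the Extension class loader, which checks whether the class exists under the jre/lib/ext directory.
  • If the Extension class loader finds the class, it loads it, and the Application class loader does not load the class.
  • If the Extension class loader does not find it, the Application class loader searches the classpath. Remember that classpath defines where class files are loaded from, whereas PATH defines where executables such as javac and java are found.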
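The delegation chain described above can be inspected at run time. This is a minimal sketch; `LoaderChain` and `parents` are hypothetical names. Note that the exact loaders printed depend on the Java version: on Java 9 and later the Extension class loader is replaced by a Platform class loader and rt.jar no longer exists, so the output will not match the description above word for word.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    /** Walks getParent() upward; the bootstrap loader is represented by null and ends the walk. */
    static List<ClassLoader> parents(ClassLoader start) {
        List<ClassLoader> chain = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            chain.add(cl);
        }
        return chain;
    }

    public static void main(String[] args) {
        // Printed from child to parent; names vary by Java version.
        for (ClassLoader cl : parents(ClassLoader.getSystemClassLoader())) {
            System.out.println(cl);
        }
        // Core classes come from the bootstrap loader, which Java reports as null.
        System.out.println("String loaded by: " + String.class.getClassLoader());
    }
}
```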
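
Visibility

Under the visibility principle, a child class loader can see the classes loaded by its parent loaders, but not the other way around.

In the example below, Abc.class has already been loaded by the Application class loader. Trying to load the class with the Extension class loader (ClassLoaderTest.class.getClassLoader().getParent()) throws java.lang.ClassNotFoundException.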

package test; // needed so that the name "test.ClassLoaderTest" used below resolves

import java.util.logging.Level;
import java.util.logging.Logger;

/**
* Java program to demonstrate How ClassLoader works in Java,
* in particular about visibility principle of ClassLoader.
*
* @author Javin Paul
*/
public class ClassLoaderTest {
    public static void main(String args[]) {
        try {          
            //printing ClassLoader of this class
            System.out.println("ClassLoaderTest.getClass().getClassLoader() : "
                             + ClassLoaderTest.class.getClassLoader());

            //trying to explicitly load this class again using Extension class loader
            Class.forName("test.ClassLoaderTest", true
                        ,  ClassLoaderTest.class.getClassLoader().getParent());
        } catch (ClassNotFoundException ex) {
            Logger.getLogger(ClassLoaderTest.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}
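
Uniqueness

A class already loaded by a parent loader cannot be loaded a second time by a child loader.

Class loading in Java uses the parent-delegation model, which loads a class in the following steps:

  • The current ClassLoader first checks whether it has already loaded the class; if so, it simply returns the class it loaded earlier.

    Every class loader has its own cache of loaded classes. Once a class is loaded it goes into the cache, so the next request can return it directly.

  • If the class is not in the current ClassLoader's cache, it delegates to its parent loader. The parent follows the same strategy: it checks its own cache first, then delegates to its own parent, all the way up to the bootstrap ClassLoader.

  • If none of the parent loaders can load the class, the current class loader loads it and puts it in its own cache so that later requests can be answered directly.

Namespaces

A class is identified by its fully qualified name together with the ClassLoader that loaded it. In other words, even if two classes have the same fully qualified name, they are different classes in the JVM if they were loaded by different ClassLoaders.

The delegation model improves interoperability between ClassLoaders. Classes such as HashMap and LinkedList are loaded by the bootstrap class loader, so no matter how many class loaders your program has, these classes are shared. This avoids the confusion that would arise if different class loaders loaded different classes with the same name.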

Custom ClassLoader

The loadClass method

public Class<?> loadClass(String name) throws ClassNotFoundException {
    return loadClass(name, false);
}

protected synchronized Class<?> loadClass(String name, boolean resolve)
    throws ClassNotFoundException
{
    // First, check if the class has already been loaded
    Class<?> c = findLoadedClass(name);
    if (c == null) {
        try {
            if (parent != null) {
                // Not loaded yet and a parent loader exists: delegate to the parent
                c = parent.loadClass(name, false);
            } else {
                // No parent loader: delegate to the bootstrap loader
                c = findBootstrapClass0(name);
            }
        } catch (ClassNotFoundException e) {
            // If still not found, then invoke findClass in order
            // to find the class.
            c = findClass(name);
        }
    }
    if (resolve) {
        resolveClass(c);
    }
    return c;
}

public Class<?> loadClass(String name) throws ClassNotFoundException is not marked final, which means we can override it; in other words, parent delegation can be bypassed.

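findClass

Looking at the source of java.lang.ClassLoader, we find findClass implemented as follows:

protected Class<?> findClass(String name) throws ClassNotFoundException {
  throw new ClassNotFoundException(name);
}

The default implementation simply throws an exception, and we can override this method.

When writing our own ClassLoader, if we want to follow parent delegation, overriding findClass is all that is needed.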
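A small sketch shows that overriding only findClass keeps parent delegation intact. `TracingLoader` is a hypothetical class for this illustration; it records every findClass call and never actually defines a class, whereas a real loader would obtain the class bytes and pass them to defineClass.

```java
import java.util.ArrayList;
import java.util.List;

class TracingLoader extends ClassLoader {
    /** Names for which the parent chain failed and findClass was consulted. */
    final List<String> findClassCalls = new ArrayList<>();

    TracingLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        findClassCalls.add(name);
        // A real loader would read the class bytes here and call
        // defineClass(name, bytes, 0, bytes.length) instead of giving up.
        throw new ClassNotFoundException(name);
    }

    public static void main(String[] args) {
        TracingLoader loader = new TracingLoader(ClassLoader.getSystemClassLoader());
        try {
            Class<?> c = loader.loadClass("java.lang.String");
            System.out.println(c == String.class);     // true: the parent chain answered
            System.out.println(loader.findClassCalls); // []: findClass was never reached
            loader.loadClass("no.such.Type");          // parents fail, so findClass runs
        } catch (ClassNotFoundException e) {
            System.out.println(loader.findClassCalls); // [no.such.Type]
        }
    }
}
```

Because loadClass is inherited unchanged, the parents are always asked first and findClass acts only as the fallback.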

defineClass

The source of defineClass:

protected final Class<?> defineClass(String name, byte[] b, int off, int len)
  throws ClassFormatError{
      return defineClass(name, b, off, len, null);
}

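The code above shows that this method is declared final, which means it cannot be overridden. It is in fact the only entry point the JVM leaves us, and through this single entry point the JVM guarantees that class files conform to the class definition laid down in the Java Virtual Machine Specification. The method ultimately calls a native method to do the actual class loading.

You can certainly write your own ClassLoader and try to load your own java.lang.String class, but you will find that it does not succeed: for classes whose names start with java., the JVM implementation ensures that they can only be loaded by the bootstrap loader.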

Reference:

http://imtiger.net/blog/2009/11/09/java-classloader/

http://www.importnew.com/6581.html

2015/10/25

---

layout: post
title: "Creating an Immutable Class in Java"
category: Reading Notes

tags: ["读文章", "Java"]

{% include JB/setup %}

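Creating an immutable class

  • Declare the class final so that it cannot be extended
  • Declare all fields private so that they cannot be accessed directly
  • Do not provide setter methods for the fields
  • Declare all mutable fields final so that they can be assigned only once
  • Initialize all fields in the constructor, performing a deep copy
  • In getter methods, do not return the object itself; clone it and return the copy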

    package com.journaldev.java;
     
    import java.util.HashMap;
    import java.util.Iterator;
     
    public final class FinalClassExample {
     
        private final int id;
     
        private final String name;
     
        private final HashMap testMap;
     
        public int getId() {
            return id;
        }
     
        public String getName() {
            return name;
        }
     
        /**
         * Accessor for the mutable field: returns a copy, not the internal map
         */
        public HashMap getTestMap() {
            //return testMap;
            return (HashMap) testMap.clone();
        }

        /**
         * Constructor that performs a deep copy
         * @param i
         * @param n
         * @param hm
         */

        public FinalClassExample(int i, String n, HashMap hm){
            System.out.println("Performing Deep Copy for Object initialization");
            this.id=i;
            this.name=n;
            HashMap tempMap=new HashMap();
            String key;
            Iterator it = hm.keySet().iterator();
            while(it.hasNext()){
                // the raw Iterator returns Object, so a cast is required
                key=(String) it.next();
                tempMap.put(key, hm.get(key));
            }
            this.testMap=tempMap;
        }

        /**
         * Constructor that performs a shallow copy
         * @param i
         * @param n
         * @param hm
         */
        /**
        public FinalClassExample(int i, String n, HashMap hm){
            System.out.println("Performing Shallow Copy for Object initialization");
            this.id=i;
            this.name=n;
            this.testMap=hm;
        }
        */

        /**
         * Tests the result of the shallow copy.
         * To create an immutable class, use the deep copy.
         * @param args
         */
        public static void main(String[] args) {
            HashMap h1 = new HashMap();
            h1.put("1", "first");
            h1.put("2", "second");
     
            String s = "original";
     
            int i=10;
     
            FinalClassExample ce = new FinalClassExample(i,s,h1);
     
            //Lets see whether its copy by field or reference
            System.out.println(s==ce.getName());
            System.out.println(h1 == ce.getTestMap());
            //print the ce values
            System.out.println("ce id:"+ce.getId());
            System.out.println("ce name:"+ce.getName());
            System.out.println("ce testMap:"+ce.getTestMap());
            //change the local variable values
            i=20;
            s="modified";
            h1.put("3", "third");
            //print the values again
            System.out.println("ce id after local variable change:"+ce.getId());
            System.out.println("ce name after local variable change:"+ce.getName());
            System.out.println("ce testMap after local variable change:"+ce.getTestMap());
     
            HashMap hmTest = ce.getTestMap();
            hmTest.put("4", "new");
     
            System.out.println("ce testMap after changing variable from accessor methods:"+ce.getTestMap());
     
        }
     
    }
    

Output:

Performing Deep Copy for Object initialization
true
false
ce id:10
ce name:original
ce testMap:{2=second, 1=first}
ce id after local variable change:10
ce name after local variable change:original
ce testMap after local variable change:{2=second, 1=first}
ce testMap after changing variable from accessor methods:{2=second, 1=first}
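
The same checklist can be followed with generics and an unmodifiable view instead of raw types and clone(). This is an alternative sketch, not the original article's code; `ImmutablePerson` and its field names are made up for the illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class ImmutablePerson {                 // final: cannot be subclassed
    private final int id;                     // private final fields, no setters
    private final String name;
    private final Map<String, String> attributes;

    ImmutablePerson(int id, String name, Map<String, String> attributes) {
        this.id = id;
        this.name = name;
        // Defensive copy of the caller's map, wrapped so that it cannot be modified.
        this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
    }

    int getId() { return id; }
    String getName() { return name; }
    /** Safe to hand out directly because the view rejects modification. */
    Map<String, String> getAttributes() { return attributes; }

    public static void main(String[] args) {
        Map<String, String> source = new HashMap<>();
        source.put("1", "first");
        ImmutablePerson p = new ImmutablePerson(10, "original", source);

        source.put("2", "second");                      // later change to the caller's map
        System.out.println(p.getAttributes().size());   // 1: the object is unaffected

        try {
            p.getAttributes().put("3", "third");
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only view");       // writes through the getter are rejected
        }
    }
}
```

The copy is sufficient here because the map holds immutable String values; mutable values would need to be copied as well.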
2015/10/25

---

layout: post
title: "Three Meanings of Stack"
category: Reading Notes

tags: ["读文章", "原理"]

{% include JB/setup %}

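Three Meanings of Stack

First: a data structure

Its defining property is LIFO (last in, first out).

Second: the way code runs

Call stack

Functions or subroutines are stacked like building blocks so that calls can be nested layer upon layer.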

class Student{
    int age;              
    String name;      

    public Student(int Age, String Name)
    {
        this.age = Age;
        setName(Name);
    }
    public void setName(String Name)
    {
        this.name = Name;
    }
}

public class Main{
    public static void main(String[] args) {
            Student s;           
            s = new Student(23,"Jonh");
    }
}

[Figure: stack.png]
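
The layering of calls can be observed directly by capturing a stack trace. This is a small sketch with hypothetical names (`CallStackDemo`, `outer`, `inner`); the frames are listed innermost first.

```java
class CallStackDemo {
    static StackTraceElement[] outer() {
        return inner();                          // pushes a frame for inner() on top of outer()
    }

    static StackTraceElement[] inner() {
        // Captures the frames currently on this thread's call stack, innermost first.
        return new Throwable().getStackTrace();
    }

    public static void main(String[] args) {
        for (StackTraceElement frame : outer()) {
            System.out.println(frame.getClassName() + "." + frame.getMethodName());
        }
        // The first lines printed are CallStackDemo.inner, CallStackDemo.outer, CallStackDemo.main
    }
}
```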

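Third: a memory region

A memory region for storing data. A running program needs memory to hold its data, and the system generally sets aside two different kinds of memory space: one called the stack and the other called the heap.

Their main difference: the stack is structured, with each block stored in a definite order and of a known size; the heap is unstructured, and data can be placed anywhere. Addressing on the stack is therefore faster than on the heap.

Generally, each thread gets its own stack and each process gets one heap. In other words, a stack is private to a thread, while the heap is shared by threads. A stack's size is fixed when it is created, and exceeding it causes a stack overflow error; the heap's size is not fixed and can keep growing as needed.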
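
The fixed size of a thread's stack can be demonstrated with unbounded recursion. This is a sketch only; `StackDepth` and `measure` are hypothetical names, and the depth reached depends on the JVM, the frame size, and settings such as -Xss, so no particular number should be expected.

```java
class StackDepth {
    private static int depth;

    /** Recurses without a base case so that every call adds one more frame. */
    private static void recurse() {
        depth++;
        recurse();
    }

    /** Counts how many frames fit before the thread's stack is exhausted. */
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The stack is full; unwinding back to here releases all the frames.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before overflow: " + measure());
    }
}
```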

The rule for where data goes: local data whose size is known is generally placed on the stack; everything else goes on the heap.

public void Method1()
{
    int i=4;

    int y=2;

    class1 cls1 = new class1();
}

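[Figure: stack1.png]

When Method1 finishes running

The whole stack is cleared, and the three variables i, y, and cls1 disappear: they are local variables, so there is no need for them once the block has finished running. The object instance on the heap, however, lives on until the system's garbage collector reclaims that memory. This is why memory leaks generally occur on the heap: some memory is no longer used but, for one reason or another, is never reclaimed by the system.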

Reference:

http://www.ruanyifeng.com/blog/2013/11/stack.html

2015/10/25