Wafer Map Defect Pattern Classification and Image Retrieval Using Convolutional Neural Network

Paper Review/Study 2021. 6. 30. 15:33

1. Introduction

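- In semiconductor manufacturing, wafer maps are used to visualize defect patterns and identify potential process issues

- Inspection is performed after specific process steps, and a wafer map is generated from the locations of the defects detected within the dies

- The main purpose of wafer map visualization is to monitor abnormal defect signatures and quickly resolve the related process problems

- If a wafer map library is built together with the associated root causes, defect pattern similarity between wafers can be a good indicator of the root cause

* Two components needed to build an effective knowledge base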

1) wafer map defect pattern classification

2) wafer map image retrieval from historical wafer map libraries

- Component 1 provides data such as the occurrence rate per defect class and helps engineers focus on the most important issues

- Component 2 helps identify the root cause by querying historical wafer maps whose root causes are already known

* wafer map defect pattern classification

1) model-based pattern recognition

- Define the probability distribution function of each defect pattern in advance, then use an information criterion to select the best-fitting model

- Akaike information criterion (AIC), Bayesian information criterion (BIC) (? never heard of these..)
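
Since AIC and BIC were new to me, here is a minimal sketch of how they pick between candidate models. The standard definitions are AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L (k = number of free parameters, n = number of samples, L = maximized likelihood); the lower score wins. The toy data (`radial_positions`) and the two candidate models are my own assumptions for illustration, not the models used in the papers cited by the authors.

```python
import math

def aic(log_likelihood, k):
    # Akaike information criterion: lower is better
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    # Bayesian information criterion: penalizes parameters more as n grows
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical toy data: normalized radial positions (0 = center, 1 = edge)
radial_positions = [0.81, 0.85, 0.90, 0.88, 0.79, 0.92, 0.86, 0.83]
n = len(radial_positions)

# Candidate A (assumed): uniform density on [0, 1], no free parameters.
# The density is 1 everywhere, so the log-likelihood is 0.
loglik_uniform = 0.0

# Candidate B (assumed): Gaussian with maximum-likelihood mean and variance.
mean = sum(radial_positions) / n
var = sum((x - mean) ** 2 for x in radial_positions) / n
loglik_gauss = -0.5 * n * math.log(2 * math.pi * var) - 0.5 * n

scores = {
    "uniform": (aic(loglik_uniform, 0), bic(loglik_uniform, 0, n)),
    "gaussian": (aic(loglik_gauss, 2), bic(loglik_gauss, 2, n)),
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print(best_by_aic, best_by_bic)
```

The defects here are concentrated near one radius, so the Gaussian earns a much higher likelihood that outweighs its two-parameter penalty under both criteria.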

2) feature extraction based pattern recognition

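- Extract pattern features using methods such as the correlogram and Radon transform (? never heard of this one either...)

- Once the pattern features are extracted, apply a general pattern classification algorithm such as a support vector machine, neural network, or nearest neighbor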
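
To get a feel for the "extract features, then classify" pipeline, below is a rough sketch. It uses row and column sums as features, which correspond to only two projection angles (0 and 90 degrees) of a Radon transform, followed by a 1-nearest-neighbor classifier. The 5 x 5 maps, labels, and function names are all made up for illustration; real feature extractors are considerably richer than this.

```python
def projection_features(wafer):
    # Row sums and column sums: projections of the defect map along two axes
    row_sums = [sum(row) for row in wafer]
    col_sums = [sum(col) for col in zip(*wafer)]
    return row_sums + col_sums

def nearest_neighbor(query_features, labeled_features):
    # labeled_features: list of (feature_vector, label) pairs
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best = min(labeled_features,
               key=lambda item: squared_distance(query_features, item[0]))
    return best[1]

# Hypothetical 5 x 5 reference maps (1 = defective die, 0 = good die)
horizontal = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]
vertical = [[0, 0, 1, 0, 0] for _ in range(5)]
references = [
    (projection_features(horizontal), "horizontal scratch"),
    (projection_features(vertical), "vertical scratch"),
]

# Query: a shorter horizontal scratch on the same row
query = [[0] * 5, [0] * 5, [1, 1, 1, 1, 0], [0] * 5, [0] * 5]
predicted = nearest_neighbor(projection_features(query), references)
print(predicted)
```

Note that these raw projections are not shift-invariant, so a scratch on a different row could land closer to the wrong reference. That kind of hand-tuning is exactly the feature engineering burden the deep CNN approach avoids.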

* deep CNN

- Recently achieved state-of-the-art image classification performance

- An end-to-end model that does not need task-specific feature engineering

- A big advantage: there is no need to develop a task-specific feature extractor, and no domain-specific expertise is required

* image retrieval

- The task of finding images that contain objects or scenes similar to a given query image

- Traditionally, image retrieval required feature extraction based on the color or shape of objects

- Because a deep CNN can learn rich features at each layer, these intermediate features can serve as good descriptors for image retrieval

* This paper

- Applies a CNN to the defect pattern classification and wafer map retrieval tasks

- Real wafer map data is highly imbalanced

- Uses 28600 simulated wafer maps & 1191 real wafer maps

- Simulated wafer map data is used for CNN training and validation

- The trained CNN is then evaluated on the real data

2. Method

2-1. Wafer Map Pattern Generation

- (don't feel like writing this part up)

2-2. Convolutional Neural Network Configuration

- input image size : 286 x 400

- Consists of 3 convolutional layers, receptive field size : 3 x 3, stride : 1

- Rectified linear activation (ReLU) after each convolutional layer

- max pooling size : 2 x 2

- FC layer of size 256, sigmoid activation

- The last FC layer has the same size as the number of defect classes, with softmax used to compute the per-class probabilities
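
To see what this configuration does to the 286 x 400 input, here is a small shape calculation. Two things are my assumptions because my notes above do not record them: the convolutions use no padding, and a 2 x 2 max pooling with stride 2 follows every convolutional layer. The number of filters per layer is not in my notes either, so only the spatial size is traced.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    # Standard output-size formula for a convolution along one dimension
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, window=2):
    # Non-overlapping pooling (stride equal to the window), floor division
    return size // window

def trace_shapes(height, width, n_conv=3):
    shapes = [(height, width)]
    for _ in range(n_conv):
        height, width = conv_out(height), conv_out(width)   # 3 x 3 conv, stride 1
        height, width = pool_out(height), pool_out(width)   # 2 x 2 max pooling
        shapes.append((height, width))
    return shapes

shapes = trace_shapes(286, 400)
for stage, (h, w) in enumerate(shapes):
    print(f"after block {stage}: {h} x {w}")
```

Under these assumptions the feature map entering the 256-node FC layer is 34 x 48 per channel. With "same" padding instead it would be 35 x 50.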

2-3. Wafer Map Image Retrieval Using Convolutional Neural Network

- Images are high-dimensional data, so dimensionality reduction is essential for fast search over a large dataset

- Uses an approach similar to [9]

- In this paper, a latent layer is not required because of the node size of the FC layer

- Features are extracted simply by using the sigmoid activation of the FC layer with 256 nodes

- To obtain a binary code for each wafer map, the sigmoid activation output is thresholded at 0.5

- Once the binary code library for all wafer maps is built, the Hamming distance is used to search for wafer maps similar to the query wafer map
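
The retrieval steps above (threshold at 0.5, then rank by Hamming distance) can be sketched as follows. For readability the codes are 8 bits instead of 256, and the library entries, wafer names, and activation values are hypothetical. Whether a value of exactly 0.5 maps to 1 or 0 is my assumption.

```python
def to_binary_code(activations, threshold=0.5):
    # Sigmoid outputs lie in (0, 1); values at or above the threshold become 1
    return [1 if a >= threshold else 0 for a in activations]

def hamming_distance(code_a, code_b):
    # Number of bit positions where the two codes differ
    return sum(x != y for x, y in zip(code_a, code_b))

def retrieve(query_code, library, top_k=3):
    # library: list of (wafer_name, binary_code) pairs
    ranked = sorted(library,
                    key=lambda item: hamming_distance(query_code, item[1]))
    return [name for name, _ in ranked[:top_k]]

# Hypothetical binary code library built from historical wafer maps
library = [
    ("wafer_a", [1, 0, 1, 1, 0, 0, 1, 0]),
    ("wafer_b", [0, 1, 0, 0, 1, 1, 0, 1]),
    ("wafer_c", [1, 0, 1, 0, 0, 0, 1, 0]),
    ("wafer_d", [1, 1, 1, 1, 0, 0, 1, 1]),
]

# Hypothetical sigmoid activations of the FC layer for the query wafer map
query_activations = [0.9, 0.1, 0.8, 0.7, 0.2, 0.4, 0.6, 0.3]
query_code = to_binary_code(query_activations)
top3 = retrieve(query_code, library)
print(query_code, top3)
```

Comparing short binary codes is cheap (essentially XOR and bit counting), which is why this kind of search stays fast even over a large library.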

3. Result

3-1. Wafer Map Pattern Generation

- 22 defect pattern classes are defined

- Simulated wafer maps consist of the 3 types below

1) pure random defects

2) random defects and typical non-random defects

3) random defects and multiple different non-random defect types

- Sometimes the defect location provides locational commonality information about specific process equipment, which helps identify a specific issue

- Quadrant line scratch and non-random cluster classes are added

- Uses a defect density map (? what even is this...)
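
I skipped the generation details in section 2-1, so the following is only my own toy illustration of types 1) and 2): random defects on a circular wafer, optionally combined with one non-random pattern (a horizontal line scratch). The grid size, defect rate, and function name are hypothetical and are not the paper's generation procedure.

```python
import random

def make_wafer(size=21, random_rate=0.02, scratch_row=None, seed=0):
    # Returns a size x size grid: None = outside the wafer,
    # 0 = good die, 1 = defective die
    rng = random.Random(seed)
    center = (size - 1) / 2
    radius = size / 2
    wafer = []
    for y in range(size):
        row = []
        for x in range(size):
            inside = (x - center) ** 2 + (y - center) ** 2 <= radius ** 2
            if not inside:
                row.append(None)
            elif scratch_row is not None and y == scratch_row:
                row.append(1)  # non-random pattern: line scratch
            else:
                row.append(1 if rng.random() < random_rate else 0)  # random defect
        wafer.append(row)
    return wafer

pure_random = make_wafer()                    # type 1: random defects only
with_scratch = make_wafer(scratch_row=10)     # type 2: random + one pattern

# Print the scratch wafer as text: '.' outside, 'o' good, 'X' defective
symbols = {None: ".", 0: "o", 1: "X"}
for row in with_scratch:
    print("".join(symbols[die] for die in row))
```

Type 3) would overlay more than one non-random pattern on the same grid in the same way.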

Figure 2. The example of the generated wafer map for each class

3-2. Wafer Map Classification Accuracy

- 1300 wafer maps generated per class: training 700 / validation 300 / test 300

- On the simulated wafer map data, after 10 epochs, training accuracy : 99.8%, validation accuracy : 97.8%
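
A quick arithmetic check of the split sizes, using the 22 classes from section 3-1. The total comes out to the 28600 simulated wafer maps mentioned in the Introduction.

```python
n_classes = 22
per_class_split = {"training": 700, "validation": 300, "test": 300}

per_class_total = sum(per_class_split.values())
split_totals = {name: count * n_classes for name, count in per_class_split.items()}
grand_total = per_class_total * n_classes

print(per_class_total)   # wafer maps per class
print(split_totals)      # wafer maps per split across all classes
print(grand_total)       # all simulated wafer maps
```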

Figure 3. Accuracy confusion matrix in percentage for the simulated test wafer maps

- The table above is the accuracy confusion matrix on the simulated test wafer map data

- Most class accuracies are above 95%, except C5 at 89.0% and C6 at 87.7%

- Overall accuracy is 98.2%
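
For reference, this is how per-class and overall accuracy fall out of a confusion matrix of counts (rows = true class, columns = predicted class). The 3-class matrix below is made-up toy data, not the values from Figure 3.

```python
def per_class_accuracy(matrix):
    # Diagonal entry divided by the row total (all samples of that true class)
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

def overall_accuracy(matrix):
    # Sum of the diagonal divided by the total number of samples
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical counts for 3 classes with 300 test wafer maps each
toy_matrix = [
    [290, 8, 2],
    [20, 267, 13],
    [3, 4, 293],
]

class_acc = per_class_accuracy(toy_matrix)
overall = overall_accuracy(toy_matrix)
print([round(a * 100, 1) for a in class_acc], round(overall * 100, 1))
```

When every class has the same number of test samples, as in the simulated test set, the overall accuracy equals the plain average of the per-class accuracies.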

Figure 4. The training and validation accuracy

- Average processing time per epoch is 110.6 seconds

Figure 5. The misclassified wafer map (left) and the top 5 class probability (right)

- The figure above shows an example of a misclassified wafer map and its top 5 class probabilities

- An image whose true class is C6 was misclassified as C5, but C6 had the 2nd highest probability
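
A small sketch of how a top 5 list like the one in Figure 5 is obtained from the last layer: apply softmax to the class scores and sort. The scores below are invented so that C5 narrowly beats C6, mirroring the misclassification described above; they are not the paper's numbers, and only 7 of the 22 classes are shown.

```python
import math

def softmax(scores):
    # Subtract the maximum first for numerical stability
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(labels, probabilities, k=5):
    ranked = sorted(zip(labels, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical raw scores from the last FC layer
labels = ["C1", "C2", "C3", "C4", "C5", "C6", "C7"]
scores = [0.1, -1.0, 0.3, 0.0, 2.2, 2.0, -0.5]

probabilities = softmax(scores)
top5 = top_k(labels, probabilities)
for label, p in top5:
    print(f"{label}: {p:.3f}")
```

Looking at the runner-up probability is a cheap way to spot near misses like this one, where the true class is ranked second.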

Figure 6. Accuracy confusion matrix in percentage for the real wafer maps

- The table above is the accuracy confusion matrix on the real wafer map data

- The real dataset contains only 9 classes and is highly imbalanced, with class C1 dominant

- C6 reaches only 66.7% because it has just 3 wafer maps (2 of 3 correct)

Figure 7. The misclassified wafer map from the real wafer (left) and the top 5 class probability (right)

- The figure above shows an example of a misclassified wafer map and its top 5 class probabilities

3-3. Wafer Map Image Retrieval

Figure 8. Query wafer map and the corresponding top 3 retrieved wafer map images

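- In the figure above, the first image in each column is the query image, and the rest are the top 3 retrieved wafer map images

- (a) uses a simulated wafer as the query image, (b) a real wafer

- The table above evaluates image retrieval performance with an error rate computed from the class of the top 1 retrieved image and the true class

- Confirms that similar wafer maps are successfully retrieved from the library

- Image retrieval from a library of 18000 wafer maps takes 0.13 seconds per image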
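
The top 1 error rate used for the retrieval evaluation reduces to a simple count: a query is an error when the class of its nearest retrieved wafer map differs from the query's true class. The labels below are hypothetical examples, not results from the paper.

```python
def top1_error_rate(true_classes, top1_retrieved_classes):
    # Fraction of queries whose top 1 retrieved image has a different class
    errors = sum(t != r for t, r in zip(true_classes, top1_retrieved_classes))
    return errors / len(true_classes)

# Hypothetical query classes and classes of their top 1 retrieved wafer maps
true_classes = ["C1", "C5", "C6", "C2"]
retrieved_classes = ["C1", "C5", "C5", "C2"]

error_rate = top1_error_rate(true_classes, retrieved_classes)
print(f"top 1 error rate: {error_rate:.2%}")
```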

4. Conclusion

- Proposes a CNN-based method for wafer map pattern classification and wafer map image retrieval

- In real semiconductor manufacturing, rare event detection is very important for maintaining high yield

- Even without real data, generating wafer maps theoretically through simulation makes rare event detection possible

- Using binary codes generated by the FC layer of the CNN, CNN-based image retrieval shows good performance and efficiency

- Once a specific defect pattern is linked to its root cause and solution, wafer map image retrieval can help in taking action on the problematic process step

This paper : Nakazawa, T., & Kulkarni, D. V. (2018). Wafer map defect pattern classification and image retrieval using convolutional neural network. IEEE Transactions on Semiconductor Manufacturing, 31(2), 309–314. https://doi.org/10.1109/TSM.2018.2795466

[9] : K. Lin, H.-F. Yang, J.-H. Hsiao, and C.-S. Chen, “Deep learning of binary hash codes for fast image retrieval,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Boston, MA, USA, 2015, pp. 27–35.

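(I read and reviewed this one too)

* Personal review

- Right, unless you are using an existing model whose architecture is already fixed, there seems to be no need to add a latent layer; just adjust the FC layer node count and use sigmoid activation like this paper does

- But I still don't get why image retrieval is done at all.........

- If the goal is classification, you can just use the output of the final softmax layer........ and even to get the binary code you still have to feed the image through the same CNN anyway, right.................

- Maybe root causes are linked to each individual image rather than to the pattern class? Is that why?

- Should also study generating data by simulation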
