<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>IT 뉴비 생존기</title>
    <link>https://hopulence.tistory.com/</link>
    <description></description>
    <language>ko</language>
    <pubDate>Wed, 8 Apr 2026 15:20:04 +0900</pubDate>
    <generator>TISTORY</generator>
    <ttl>100</ttl>
    <managingEditor>Hopulence</managingEditor>
    <image>
      <title>IT 뉴비 생존기</title>
      <url>https://tistory1.daumcdn.net/tistory/3568595/attach/f9bd4549091045cb9fb0bfec0d5ae128</url>
      <link>https://hopulence.tistory.com</link>
    </image>
    <item>
      <title>Istio Envoy Packet Loss Tuning Notes</title>
      <link>https://hopulence.tistory.com/55</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Background&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;In a 1 Gbps environment, Istio was dropping a large number of packets while forwarding traffic. This post records how I worked through the problem.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;1. Test before tuning&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;A network that carries production traffic should generally keep its packet loss rate at or below 10^(-6), i.e. 0.0001%.&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;About 0.6% of packets were dropped on the physical interface.&lt;/li&gt;
&lt;li&gt;On Istio's Calico interface, 6% were dropped on RX and 3.5% on TX.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;706&quot; data-origin-height=&quot;382&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/SlUzt/dJMcaaYFUNd/QbbPCpntDSVf8xC5LcXVg0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/SlUzt/dJMcaaYFUNd/QbbPCpntDSVf8xC5LcXVg0/img.png&quot; data-alt=&quot;Packet drop rate on the physical interface&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/SlUzt/dJMcaaYFUNd/QbbPCpntDSVf8xC5LcXVg0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FSlUzt%2FdJMcaaYFUNd%2FQbbPCpntDSVf8xC5LcXVg0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;348&quot; height=&quot;188&quot; data-origin-width=&quot;706&quot; data-origin-height=&quot;382&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Packet drop rate on the physical interface&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Check ethtool statistics
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;rx_fifo_errors: errors raised because the NIC's FIFO queue filled up before packets could be moved to the ring buffer&lt;/li&gt;
&lt;li&gt;rx_missed_errors: number of packets dropped because the FIFO queue or the ring buffer overflowed&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772170578734&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# ethtool -S eno1
     rx_missed_errors: 19882559
     ...
     rx_fifo_errors: 19882559&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1772177466333&quot; class=&quot;routeros&quot; style=&quot;background-color: #f8f8f8; color: #383a42; text-align: start;&quot; data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;shell&quot;&gt;&lt;code&gt;# Documentation/ABI/testing/sysfs-class-net-statistics
What:		/sys/class/net/&amp;lt;iface&amp;gt;/statistics/rx_missed_errors
...
Description:
		Indicates the number of received packets that have been missed
		due to lack of capacity in the receive side. See the network
		driver for the exact meaning of this value.&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;SoftIRQ&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772172170007&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# awk '{for (i=1; i&amp;lt;=NF; i++) printf strtonum(&quot;0x&quot; $i) (i==NF?&quot;\n&quot;:&quot; &quot;)}' /proc/net/softnet_stat | column -t
# columns: 1 = received frames, 3 = time_squeeze, 13 = cpu_number
512787    0  3  0  0  0  0  0  0  0  0  0  0
495932    0  3  0  0  0  0  0  0  0  0  0  1
476620    0  3  0  0  0  0  0  0  0  0  0  2
522209    0  3  0  0  0  0  0  0  0  0  0  3&lt;/code&gt;&lt;/pre&gt;
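&lt;p data-ke-size=&quot;size16&quot;&gt;The per-CPU lines above can be totaled with a few lines of shell. This is only a minimal sketch and was not part of the original investigation: the function name sum_softnet is hypothetical, and it assumes the usual layout in which the second hexadecimal field is the dropped counter and the third is time_squeeze.&lt;/p&gt;

```shell
# Hypothetical helper: totals the dropped (2nd) and time_squeeze (3rd)
# hex fields of softnet_stat-style lines read from stdin.
sum_softnet() {
  dropped_total=0
  squeeze_total=0
  while read -r processed dropped squeezed rest; do
    dropped_total=$((dropped_total + 0x$dropped))
    squeeze_total=$((squeeze_total + 0x$squeezed))
  done
  echo "dropped=$dropped_total time_squeeze=$squeeze_total"
}

# Two sample lines in the raw hex format of /proc/net/softnet_stat
printf '%s\n' \
  '0007d313 00000000 00000003 00000000' \
  '0007913c 00000000 00000003 00000000' | sum_softnet
```

&lt;p data-ke-size=&quot;size16&quot;&gt;On a live host the same function could be fed with cat /proc/net/softnet_stat | sum_softnet. A dropped total that stays at zero while the NIC counters keep growing would be consistent with packets being lost before they reach the kernel backlog.&lt;/p&gt;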
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Check NIC utilization
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The NIC still has bandwidth headroom, yet packets are being dropped.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772179053391&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# sar -n DEV 1
10:47:13 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s   %ifutil
10:47:15 PM        lo    505.50    505.50     71.79     71.79      0.00      0.00      0.00      0.00
10:47:15 PM      eno1 215271.50 240560.50  73207.84  82766.38      0.00      0.00      1.50     67.80&lt;/code&gt;&lt;/pre&gt;
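&lt;p data-ke-size=&quot;size16&quot;&gt;The %ifutil figure can be reproduced from the throughput columns. The following is a rough sketch under two assumptions that the original does not state: sar counts 1 kB as 1024 bytes, and the link speed is 1000 Mbit/s. The function name ifutil is hypothetical.&lt;/p&gt;

```shell
# Hypothetical helper: link utilization in percent.
# $1 = throughput in kB/s (1 kB = 1024 bytes), $2 = link speed in Mbit/s
ifutil() {
  awk -v kb="$1" -v mbps="$2" \
    'BEGIN { printf "%.2f\n", kb * 1024 * 8 / (mbps * 1000000) * 100 }'
}

# txkB/s of eno1 from the sar output, on a 1 Gbit/s link
ifutil 82766.38 1000
```

&lt;p data-ke-size=&quot;size16&quot;&gt;With the sample above this prints 67.80, which matches the %ifutil column and suggests that the transmit direction is the busier one.&lt;/p&gt;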
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Conclusion
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The drops occurred because the CPU could not poll packets out of the NIC buffer as fast as they arrived.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;2. OS tuning&lt;/h3&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Adjusting the TCP kernel buffers&lt;/h4&gt;
&lt;pre id=&quot;code_1772170619500&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_wmem=4096 65536 16777216
net.core.netdev_max_backlog=5000
net.core.netdev_budget=1000
net.core.netdev_budget_usecs=10000&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Adjusting the ring buffer&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The size can be changed with ethtool -G eno1 rx 4096 tx 4096, but the setting is reset on reboot.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772170653310&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# ethtool -g eno1
Ring parameters for eno1:
Pre-set maximums:
RX:		4096
RX Mini:	n/a
RX Jumbo:	n/a
TX:		4096
Current hardware settings:
RX:		4096
RX Mini:	n/a
RX Jumbo:	n/a
TX:		4096&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Changing the CPU frequency&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The governor can be changed with cpupower frequency-set --governor performance; the setting is reset on reboot.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772170738940&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# cpupower frequency-info
analyzing CPU 0:
  ...
  current CPU frequency: 2.80 GHz (asserted by call to hardware)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;* Whether packet fragmentation occurs&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;When Calico uses VXLAN or IPIP, an outer header is added to each packet, so the result can exceed the MTU of the physical interface.&lt;/li&gt;
&lt;li&gt;However, the default MTU of the Calico interface is already lowered to account for this: 1480 for IPIP and 1450 for VXLAN.&lt;/li&gt;
&lt;li&gt;Fragmentation is therefore not a factor here.&lt;/li&gt;
&lt;/ul&gt;
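&lt;p data-ke-size=&quot;size16&quot;&gt;Those defaults follow from subtracting the encapsulation overhead from the physical MTU. A small sketch, assuming a 1500-byte physical MTU and the usual overheads of 20 bytes for IPIP and 50 bytes for VXLAN over IPv4; the function name overlay_mtu is hypothetical.&lt;/p&gt;

```shell
# Hypothetical helper: largest inner MTU that avoids fragmentation.
# $1 = physical MTU, $2 = encapsulation (ipip or vxlan)
overlay_mtu() {
  case "$2" in
    ipip)  overhead=20 ;;  # one extra IPv4 header
    vxlan) overhead=50 ;;  # outer IPv4 + UDP + VXLAN + inner Ethernet
    *)     echo "unknown encapsulation: $2"; return 1 ;;
  esac
  echo $(($1 - overhead))
}

overlay_mtu 1500 ipip
overlay_mtu 1500 vxlan
```

&lt;p data-ke-size=&quot;size16&quot;&gt;The two calls print 1480 and 1450, the same values as the Calico defaults mentioned above.&lt;/p&gt;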
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;CPU - NIC mapping (&lt;span style=&quot;color: #333333; text-align: start;&quot;&gt;NUMA aware&lt;span&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
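&lt;li&gt;Because the irqbalance daemon is running, IRQs for packets arriving at the NIC are spread evenly across all cores.&lt;br /&gt;(On a two-socket server with a single NIC, spreading packet processing evenly means part of it crosses QPI, which degrades performance.)&lt;/li&gt;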
&lt;/ul&gt;
&lt;pre id=&quot;code_1772182091509&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# mpstat -P ALL 1
Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:     all    5.84    0.00    1.09    0.00    0.00    1.54    0.00    0.00    0.00   91.53
Average:       0    5.40    0.00    0.95    0.06    0.00    1.97    0.00    0.00    0.00   91.62
Average:       1    5.95    0.00    0.98    0.00    0.00    0.77    0.00    0.00    0.00   92.30
Average:       2    5.40    0.00    0.68    0.00    0.00    0.53    0.00    0.00    0.00   93.39
Average:       3    5.46    0.00    0.74    0.00    0.00    0.44    0.00    0.00    0.00   93.36
Average:       4    7.00    0.00    1.18    0.00    0.00    1.12    0.00    0.00    0.00   90.69
Average:       5    5.69    0.00    0.77    0.00    0.00    1.74    0.00    0.00    0.00   91.80
Average:       6    6.70    0.00    1.56    0.00    0.00    1.59    0.00    0.00    0.00   90.14
Average:       7    6.57    0.00    1.66    0.00    0.00    1.60    0.00    0.00    0.00   90.17
Average:       8    6.10    0.00    1.36    0.00    0.00    1.30    0.00    0.00    0.00   91.23
Average:       9    1.53    0.00    0.24    0.00    0.00   28.17    0.00    0.00    0.00   70.06
Average:      10    7.66    0.00    1.74    0.00    0.00    1.57    0.00    0.00    0.00   89.03
Average:      11    6.31    0.00    1.48    0.00    0.00    2.22    0.00    0.00    0.00   89.98
Average:      12    7.15    0.00    1.51    0.00    0.00    1.21    0.00    0.00    0.00   90.13&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Check the NIC RSS configuration
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&amp;nbsp;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772522119090&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# ethtool -x eno1
RX flow hash indirection table for eno1 with 8 RX ring(s):
    0:      0     0     0     0     0     0     0     0
    8:      0     0     0     0     0     0     0     0
   16:      1     1     1     1     1     1     1     1
   24:      1     1     1     1     1     1     1     1
   32:      2     2     2     2     2     2     2     2
   40:      2     2     2     2     2     2     2     2
   48:      3     3     3     3     3     3     3     3
   56:      3     3     3     3     3     3     3     3
   64:      4     4     4     4     4     4     4     4
   72:      4     4     4     4     4     4     4     4
   80:      5     5     5     5     5     5     5     5
   88:      5     5     5     5     5     5     5     5
   96:      6     6     6     6     6     6     6     6
  104:      6     6     6     6     6     6     6     6
  112:      7     7     7     7     7     7     7     7
  120:      7     7     7     7     7     7     7     7&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Stop the irqbalance daemon and configure interrupt requests to target CPUs on the same NUMA node as the NIC.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Check the NUMA topology&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772182317281&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# ethtool -i eno1
...
bus-info: 0000:63:00.0

# lspci -vvv | grep Eth -A 7
63:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
	...
	NUMA node: 0
    
# lscpu
NUMA:
  NUMA node(s):          2
  NUMA node0 CPU(s):     0-31,64-95
  NUMA node1 CPU(s):     32-63,96-127&lt;/code&gt;&lt;/pre&gt;
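&lt;p data-ke-size=&quot;size16&quot;&gt;One way to approach the pinning step is to generate the affinity commands first and review them before running anything. This is a sketch, not necessarily the exact procedure used here: it assumes the eight eno1-TxRx IRQs are numbered consecutively from 226, as in the /proc/interrupts listing in this post, and that queue i is pinned to CPU i on NUMA node 0. The function name plan_irq_affinity is hypothetical.&lt;/p&gt;

```shell
# Hypothetical helper: prints (does not run) one pinning command per queue.
# $1 = first IRQ number, $2 = number of queues, $3 = first CPU on the NIC's node
plan_irq_affinity() {
  i=0
  while [ "$i" -lt "$2" ]; do
    echo "echo $(($3 + i)) | tee /proc/irq/$(($1 + i))/smp_affinity_list"
    i=$((i + 1))
  done
}

# irqbalance has to be stopped first, or it may rewrite the affinities
echo "systemctl stop irqbalance"
plan_irq_affinity 226 8 0
```

&lt;p data-ke-size=&quot;size16&quot;&gt;The printed lines would have to be run as root, and values written under /proc/irq do not survive a reboot.&lt;/p&gt;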
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
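&lt;li&gt;Check which cores the interrupts concentrate on
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Interrupt requests follow the NUMA layout of the 32-core CPUs: they land only on node 0 cores (0-31, 64-95).&lt;/li&gt;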
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772512337022&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# cat /proc/interrupts | grep eno1
            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       CPU8       CPU9       CPU10      CPU11      CPU12      CPU13      CPU14      CPU15      CPU16      CPU17      CPU18      CPU19      CPU20      CPU21      CPU22      CPU23      CPU24      CPU25      CPU26      CPU27      CPU28      CPU29      CPU30      CPU31      CPU32      CPU33      CPU34      CPU35      CPU36      CPU37      CPU38      CPU39      CPU40      CPU41      CPU42      CPU43      CPU44      CPU45      CPU46      CPU47      CPU48      CPU49      CPU50      CPU51      CPU52      CPU53      CPU54      CPU55      CPU56      CPU57      CPU58      CPU59      CPU60      CPU61      CPU62      CPU63      CPU64      CPU65      CPU66      CPU67      CPU68      CPU69      CPU70      CPU71      CPU72      CPU73      CPU74      CPU75      CPU76      CPU77      CPU78      CPU79      CPU80      CPU81      CPU82      CPU83      CPU84       CPU85      CPU86      CPU87      CPU88      CPU89      CPU90      CPU91      CPU92      CPU93      CPU94      CPU95      CPU96      CPU97      CPU98      CPU99      CPU100     CPU101     CPU102     CPU103     CPU104     CPU105     CPU106     CPU107     CPU108     CPU109     CPU110     CPU111     CPU112     CPU113     CPU114     CPU115     CPU116     CPU117     CPU118     CPU119     CPU120     CPU121     CPU122     CPU123     CPU124     CPU125     CPU126     CPU127
 225:          0          0          0          0          0          1          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          1          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 51904512-edge      eno1
 226:   83502188          0          0          0          0          0         12          0          0          0          0     229624          0          0     106538          0          0          0          0          0          0     310412          0          0     115826          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0    8725845          0          0          0          0          0          0          0          0          0          0     356583          0          0      92022          0          0          0          0          0          0      41963          0          0     120484          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 51904513-edge      eno1-TxRx-0
 227:          0          0       4329          0      22058   79974858          0       5876          0      29406          0      40150       8842          0      39321          0          0          0       9747          0          0      63250      11937          0      65301          0          0          0      12216          0      12576          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0        144          0       6323          0      17652          0          0      13164          0      24681      14655      36264       6812          0      41162          0          0          0       1105          0        711      53743    1669879          0      72514          0          0          0        853          0       7183          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 51904514-edge      eno1-TxRx-1
 228:    8595276          0       3390          0    1293157          0          0       8021          7    1689644          0     853834       9803          0    3141289          0          0          0        339          0          0    1527820      10156          0    3137750          0          0          0       2679          0       9783          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   45519649          0       8355          0    2862458          0          0       8751         11     270635       1967    1008413       7152          0    3922754          0          0          0          0          0       2634    7118787      12291          0    1856440          0          0          0        138          0       3169          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 51904515-edge      eno1-TxRx-2
 229:          0          0      19991          0    1195216        812          0      24327          0    6518631          0    1217322      78370          0    6420069          0          0          0          0          0          0   10194227     337759          0    3374695          0          0          0      32163          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0    5497294          0       7467          0     477793          0          0      12223          0    2445081          0   41761718    1434752          0    1591891          0          0          0     180617          0          0    4848387     140226          0    2481320          0          0          0          0          0       5732          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 51904516-edge      eno1-TxRx-3
 230:          0          0     258640          0     138578     815337          0    1302684          0     418622          8     272004     134173          0     610410          0          0          0      21494          0          0     552005      55938          0     756485          0          0          0        645          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0      77610          0      12155          0     112490          0          0      45203          0     204021          0     574545      45856          0     449836          0   22684381          0          0          0   39956935     819907      98211          0     626075          0          0          0       1661          0       3159          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 51904517-edge      eno1-TxRx-4
 231:          0          0       7193          0    1112928       9224          0      92397          0     461768          0   11123588      52942          0    3207074          0          0          0       1605          0          0    3697806    1502278          0    4287738          0          0          0       3464          0        772          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0    1003157          0      67971          0     373156          0          0       3289          0    1470978       1381    1873520     118474          0    2581217          0          0          0        155          0       5594    8860813     325490          0   27589217          0          0          0          0          0       1508          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 51904518-edge      eno1-TxRx-5
 232:          0          0       1351          0     543455        391          0      21013          0     269761          0     684262       8257          0    1534331          0          0          0    2518911          0          0    1876412      13776          0    1627170          0       1377          0       9261          0       7298          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   12120260          0       3143   56453633      35954          0          0       6164          0    1808253          0     701702      62365          0    1899029          0          0          0        189          0       5465    2319681     917657          0    1046856          0          0          0       3743          0       3741          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 51904519-edge      eno1-TxRx-6
 233:          0          0       2949          0    1587065         89          0       5255          0    1796928          0    1962632       8953          7    6893264          0          0          0       1603          0          0    5698540     151787          0   22071018          0          0          0       1347          0       5294          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0      16594          0       9495          0     500265          0          0     272862          0     638106      16171   10892973      12784          0    4198715          0          0          0          0          0        821    6926534     193808          0    2916659    8371497          0          0        528          0      42622          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 51904520-edge      eno1-TxRx-7&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Test results&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;RX: 221,000,000 p/s (3.03%)
&lt;ul style=&quot;list-style-type: circle;&quot; data-indent-level=&quot;2&quot; data-local-id=&quot;cc08da1f-ca4c-4ff0-adb9-5b3fe5805161&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;Drop: 6,690,000 p/s&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;TX: 216,000,000 p/s (1.6%)
&lt;ul style=&quot;list-style-type: circle;&quot; data-indent-level=&quot;2&quot; data-local-id=&quot;a6012903-a71e-4671-97b7-cf74c5f02abd&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;Drop: 3,400,000 p/s&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
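&lt;p data-ke-size=&quot;size16&quot;&gt;The percentages in the list are the dropped count divided by the total. A minimal sketch for redoing that arithmetic; the function name drop_rate is hypothetical.&lt;/p&gt;

```shell
# Hypothetical helper: drop rate in percent.
# $1 = dropped packets, $2 = total packets
drop_rate() {
  awk -v d="$1" -v t="$2" \
    'BEGIN { if (t == 0) { print "n/a"; exit } printf "%.2f\n", d / t * 100 }'
}

drop_rate 6690000 221000000   # RX
drop_rate 3400000 216000000   # TX
```

&lt;p data-ke-size=&quot;size16&quot;&gt;This prints 3.03 and 1.57, in line with the RX and TX figures above. Both are still far above the 10^(-6) target from section 1, so OS tuning alone did not resolve the drops.&lt;/p&gt;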
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
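&lt;h3 data-ke-size=&quot;size23&quot;&gt;3. Istio configuration&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;On inspection, the existing configuration was set up in a way that caused an excessive number of 3-way handshakes. In addition, Istio is deployed NUMA-aware so that it matches the IRQ affinity settings, and its concurrency is adjusted.&lt;/p&gt;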
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;DestinationRule&amp;nbsp;settings&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
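&lt;li&gt;loadBalancer.simple: (load-balancing algorithm)
&lt;ul style=&quot;list-style-type: disc;&quot; data-indent-level=&quot;3&quot; data-local-id=&quot;cbe2e6121fbe&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;UNSPECIFIED: no LB algorithm specified (default).&lt;/li&gt;
&lt;li&gt;RANDOM: picks a random healthy host. Performs better than round robin when no health-check policy is configured.&lt;/li&gt;
&lt;li&gt;PASSTHROUGH: appears to mean that Envoy does no balancing itself and forwards the connection to the originally requested address, leaving load distribution to an external LB.&lt;/li&gt;
&lt;li&gt;ROUND_ROBIN: basic round robin; generally unsafe.&lt;/li&gt;
&lt;li&gt;LEAST_REQUEST: sends fewer requests to Pods that are slow to process them. Performs better than round robin in nearly all cases.&lt;/li&gt;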
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772183862026&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;spec:
  host: 
  trafficPolicy:
    loadBalancer:
      simple: LEAST_REQUEST
    connectionPool:
      http:
        http1MaxPendingRequests: 256000
        http2MaxRequests: 0 
        maxRequestsPerConnection: 0
        idleTimeout: 300s
      tcp:
        connectTimeout: 10s 
        maxConnectionDuration: 600s
        maxConnections: 0
    tls:
      mode: SIMPLE 
      insecureSkipVerify: true&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Verify that the settings were applied&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772183915547&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;kubectl exec &amp;lt;pod-name&amp;gt; -c istio-proxy -n &amp;lt;namespace&amp;gt; -- curl -s localhost:15000/config_dump | grep -A 50 &quot;&amp;lt;host_url&amp;gt;&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Concurrency settings&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The concurrency option sets how many worker threads Envoy uses.&lt;/li&gt;
&lt;li&gt;Each Envoy thread stores its connection data in a memory area called TLS (Thread Local Storage).&lt;/li&gt;
&lt;li&gt;With a large number of worker threads, the memory used by TLS grows excessively as the number of connections increases.&lt;/li&gt;
&lt;li&gt;In the &lt;a href=&quot;https://github.com/envoyproxy/envoy/issues/38513&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;benchmark&lt;/a&gt;, concurrency is set low for performance. I therefore set concurrency to 8 to match the number of NIC queues.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1316&quot; data-origin-height=&quot;648&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cnhTOB/dJMcahXOl17/TZFlfLNoTQKCKKDo2f70J0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cnhTOB/dJMcahXOl17/TZFlfLNoTQKCKKDo2f70J0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cnhTOB/dJMcahXOl17/TZFlfLNoTQKCKKDo2f70J0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcnhTOB%2FdJMcahXOl17%2FTZFlfLNoTQKCKKDo2f70J0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;584&quot; height=&quot;288&quot; data-origin-width=&quot;1316&quot; data-origin-height=&quot;648&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Check the number of physical NIC queues&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772512146998&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# ethtool -l eno1
Channel parameters for eno1:
Pre-set maximums:
RX:		n/a
TX:		n/a
Other:		1
Combined:	8
Current hardware settings:
RX:		n/a
TX:		n/a
Other:		1
Combined:	8&lt;/code&gt;&lt;/pre&gt;
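&lt;p data-ke-size=&quot;size16&quot;&gt;If you want to derive the concurrency value from a script instead of reading it by eye, the output above can be parsed roughly as follows. This is only a sketch: the function name is hypothetical and it assumes the NIC reports a numeric Combined value, as eno1 does here.&lt;/p&gt;

```python
def current_combined_channels(ethtool_output):
    '''Return the Combined count from the current-settings part of ethtool -l output.'''
    in_current = False
    for line in ethtool_output.splitlines():
        if line.startswith('Current hardware settings'):
            in_current = True
        elif in_current and line.startswith('Combined:'):
            return int(line.split(':', 1)[1].strip())
    raise ValueError('no current Combined channel count found')


# Sample text copied from the ethtool -l eno1 output shown above.
sample = '''Channel parameters for eno1:
Pre-set maximums:
RX:\t\tn/a
TX:\t\tn/a
Other:\t\t1
Combined:\t8
Current hardware settings:
RX:\t\tn/a
TX:\t\tn/a
Other:\t\t1
Combined:\t8'''

print(current_combined_channels(sample))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;For this NIC the result is 8, which is the value used for concurrency in this post.&lt;/p&gt;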
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Check the Istio concurrency setting&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772512194967&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# kubectl exec -n &amp;lt;namespace&amp;gt; &amp;lt;istio-pod&amp;gt; -c istio-proxy -- curl localhost:15000/server_info
&quot;concurrency&quot;: 128&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;NUMA-aware configuration&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;First, the &lt;a href=&quot;https://kubernetes.io/docs/tasks/administer-cluster/topology-manager/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;topology manager&lt;/a&gt; settings must be added to the kubelet.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Reserve cores for the OS, the kubelet, IRQ handling, and so on.&lt;/li&gt;
&lt;li&gt;Deploy Istio so that it is placed adjacent to the NIC on the PCI topology.&lt;/li&gt;
&lt;li&gt;For real use, consider the best-effort or restricted topology manager policy combined with prefer-closest-numa-nodes.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772184760273&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# vim /etc/kubernetes/kubelet.env
...
--cpu-manager-policy=static \
--topology-manager-policy=single-numa-node \ # for simple testing only
--reserved-cpus=0,1,2,3,4,5,6,7,64,65,66,67,68,69,70,71,72&lt;/code&gt;&lt;/pre&gt;
&lt;pre id=&quot;code_1772184828859&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# systemctl stop kubelet
# rm /var/lib/kubelet/cpu_manager_state
# systemctl start kubelet&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;After the kubelet is configured, the settings take effect once Istio is deployed with the Guaranteed QoS class.&lt;/li&gt;
&lt;li&gt;Verify that it was applied
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Check the CPU mask of the Istio process by its PID.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1772184864945&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# cat /proc/&amp;lt;pid&amp;gt;/status
...
Cpus_allowed:	00000000,000003c0,00000000,000003c0
Cpus_allowed_list:	13-16,77-80&lt;/code&gt;&lt;/pre&gt;
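&lt;p data-ke-size=&quot;size16&quot;&gt;As a quick sanity check, the Cpus_allowed_list value can be expanded and compared against the reserved cores. The sketch below uses the values shown in this post; the helper name is hypothetical.&lt;/p&gt;

```python
def parse_cpu_list(text):
    '''Expand a kernel CPU list such as 13-16,77-80 into a sorted list of CPU ids.'''
    cpus = set()
    for part in text.strip().split(','):
        if not part:
            continue
        if '-' in part:
            start, end = part.split('-', 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


# Values taken from the kubelet flags and the /proc status output above.
allowed = parse_cpu_list('13-16,77-80')
reserved = parse_cpu_list('0,1,2,3,4,5,6,7,64,65,66,67,68,69,70,71,72')

print(allowed)
# True means the pinned cores do not overlap the reserved ones.
print(set(allowed).isdisjoint(reserved))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;An empty overlap means the cores pinned to Istio are outside the set reserved for the OS and the kubelet.&lt;/p&gt;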
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Test results&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;RX: 53,900,000 p/s (1.17%)
&lt;ul style=&quot;list-style-type: circle;&quot; data-indent-level=&quot;2&quot; data-local-id=&quot;487b5bcd-7c46-46a2-9671-1adbc397968d&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;Drop: 925,000 p/s&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;TX: 71,900,000 p/s (0.3%)
&lt;ul style=&quot;list-style-type: circle;&quot; data-indent-level=&quot;2&quot; data-local-id=&quot;99ecaaa0-5c59-480e-ab5a-fdcb84399eea&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;Drop: 227,000 p/s&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;4. Calico eBPF configuration&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;When &lt;a href=&quot;https://docs.tigera.io/calico/latest/about/kubernetes-training/about-ebpf&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;eBPF&lt;/a&gt; is enabled in the FelixConfiguration, packets are hooked by eBPF instead of iptables and passed straight to the Calico interface.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1942&quot; data-origin-height=&quot;482&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bWAavt/dJMcadA2sID/pDmtlys4RvgmFIrTqyiq60/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bWAavt/dJMcadA2sID/pDmtlys4RvgmFIrTqyiq60/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bWAavt/dJMcadA2sID/pDmtlys4RvgmFIrTqyiq60/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbWAavt%2FdJMcadA2sID%2FpDmtlys4RvgmFIrTqyiq60%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1942&quot; height=&quot;482&quot; data-origin-width=&quot;1942&quot; data-origin-height=&quot;482&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;How to apply&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. Add the kube-apiserver environment variables to the upgrade-ipam and calico-node containers of the calico-node pod.&lt;/p&gt;
&lt;pre id=&quot;code_1772183630611&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;- name: KUBERNETES_SERVICE_HOST
  value: &quot;&amp;lt;Master IP&amp;gt;&quot;
- name: KUBERNETES_SERVICE_PORT
  value: &quot;6443&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. Disable kube-proxy&lt;/p&gt;
&lt;pre id=&quot;code_1772183652183&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;kubectl patch ds -n kube-system kube-proxy -p '{&quot;spec&quot;:{&quot;template&quot;:{&quot;spec&quot;:{&quot;nodeSelector&quot;:{&quot;non-existent&quot;: &quot;true&quot;}}}}}'&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;3. Enable BPF in Felix, then reboot the nodes&lt;/p&gt;
&lt;pre id=&quot;code_1772183666858&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;kubectl patch felixconfiguration default --type merge --patch '{&quot;spec&quot;: {&quot;bpfEnabled&quot;: true}}'&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;4. Verify that it was applied&lt;/p&gt;
&lt;pre id=&quot;code_1772183793757&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# kubectl exec -it -n kube-system calico-node-xxxxx -- calico-node -bpf routes dump&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Test results&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;RX: 91,500,000 p/s
&lt;ul style=&quot;list-style-type: circle;&quot; data-indent-level=&quot;2&quot; data-local-id=&quot;fb74bb98-d9cc-4ec4-9e1c-f796dcb0c91b&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;Drop: 0 p/s&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;TX: 313,000,000 p/s
&lt;ul style=&quot;list-style-type: circle;&quot; data-indent-level=&quot;2&quot; data-local-id=&quot;9031ab9d-d867-4ced-a844-f971a5e52062&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;Drop: 0 p/s&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>System Engineering/Network</category>
      <author>Hopulence</author>
      <guid isPermaLink="true">https://hopulence.tistory.com/55</guid>
      <comments>https://hopulence.tistory.com/55#entry55comment</comments>
      <pubDate>Thu, 19 Feb 2026 16:45:52 +0900</pubDate>
    </item>
    <item>
      <title>[Harbor] replication timeout</title>
      <link>https://hopulence.tistory.com/53</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Replication times out when bandwidth is insufficient or a layer is large.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;replicate execution log&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1758192871921&quot; class=&quot;go&quot; data-ke-language=&quot;go&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;[ERROR] [/controller/replication/transfer/image/transfer.go:396]: failed to pushing the blob sha256:, size *****: Put &quot;http://harbor-core:80/v2/*&quot;: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
[ERROR] [/controller/replication/transfer/image/transfer.go:195]: Put &quot;http://harbor-core:80/v2/*&quot;: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
[ERROR] [/controller/replication/transfer/image/transfer.go:201]: got error during the whole transfer period, mark the job failure&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;harbor-jobservice log&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1758119149868&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;[ERROR] [/jobservice/runner/redis.go:123]: Job 'REPLICATION:' exit with error: run error: got error during the whole transfer period, mark the job failure&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The timeout is defined with a default of 30 minutes.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1758119469529&quot; class=&quot;go&quot; data-ke-language=&quot;go&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;// harbor/src/pkg/registry/client.go
const (
	UserAgent = &quot;harbor-registry-client&quot;
	// DefaultHTTPClientTimeout is the default timeout for registry http client.
	DefaultHTTPClientTimeout = 30 * time.Minute
)&lt;/code&gt;&lt;/pre&gt;
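&lt;p data-ke-size=&quot;size16&quot;&gt;To get a feel for when the 30-minute default becomes too short, a rough estimate of the push time for a single layer can help. The sketch below assumes the timeout has to cover the whole blob upload and ignores protocol overhead and retries; the layer size and link speed are hypothetical.&lt;/p&gt;

```python
import math

DEFAULT_TIMEOUT_MINUTES = 30  # DefaultHTTPClientTimeout in the Harbor source quoted above


def transfer_minutes(layer_bytes, bandwidth_bits_per_second):
    '''Rough time to push one layer, ignoring protocol overhead and retries.'''
    return layer_bytes * 8 / bandwidth_bits_per_second / 60


def suggested_timeout_minutes(layer_bytes, bandwidth_bits_per_second):
    '''Smallest whole-minute timeout that covers the estimated transfer.'''
    return math.ceil(transfer_minutes(layer_bytes, bandwidth_bits_per_second))


# Hypothetical numbers: a 10 GiB layer over a 40 Mbit/s link.
layer = 10 * 1024 ** 3
link = 40_000_000

print(round(transfer_minutes(layer, link), 1))
print(suggested_timeout_minutes(layer, link))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;With these numbers the estimate is about 35.8 minutes, which is past the default, so the timeout would need to be at least 36 minutes plus some headroom.&lt;/p&gt;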
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;You can override it by declaring REGISTRY_HTTP_CLIENT_TIMEOUT (in minutes) as an env on harbor-core and harbor-jobservice.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1758119439129&quot; class=&quot;go&quot; data-ke-language=&quot;go&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;func init() {
	registryHTTPClientTimeout = DefaultHTTPClientTimeout
	// override it if read from environment variable, in minutes
	if env := os.Getenv(&quot;REGISTRY_HTTP_CLIENT_TIMEOUT&quot;); len(env) &amp;gt; 0 {
		timeout, err := strconv.ParseInt(env, 10, 64)
		if err != nil {
			log.Errorf(&quot;Failed to parse REGISTRY_HTTP_CLIENT_TIMEOUT: %v, use default value: %v&quot;, err, DefaultHTTPClientTimeout)
		} else {
			if timeout &amp;gt; 0 {
				registryHTTPClientTimeout = time.Duration(timeout) * time.Minute
			}
		}
	}
}&lt;/code&gt;&lt;/pre&gt;</description>
      <category>System Engineering/Harbor</category>
      <author>Hopulence</author>
      <guid isPermaLink="true">https://hopulence.tistory.com/53</guid>
      <comments>https://hopulence.tistory.com/53#entry53comment</comments>
      <pubDate>Wed, 17 Sep 2025 23:33:50 +0900</pubDate>
    </item>
    <item>
      <title>Terragrunt basics</title>
      <link>https://hopulence.tistory.com/52</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;1. Glossary&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Terragrunt&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A tool for orchestrating IaC written in Terraform.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Unit&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
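&lt;li&gt;A single instance of infrastructure managed by Terragrunt; an hcl file corresponds to a unit.&lt;/li&gt;
&lt;li&gt;Usually represents a single VPC, DB, server, and so on.&lt;/li&gt;
&lt;li&gt;Split into a root hcl and child hcl files; for K8s the hcl hierarchy is as follows.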
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;root.hcl (base.hcl)&lt;/li&gt;
&lt;li&gt;kubernetes.hcl for the whole cluster&lt;/li&gt;
&lt;li&gt;hcl for each namespace&lt;/li&gt;
&lt;li&gt;hcl for modules that need dependencies, such as CRs&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Stack&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A collection of units, often representing a single region, business unit, or app environment.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Module&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A set of .tf files containing multiple resources, organized as subdirectories within a stack.&lt;/li&gt;
&lt;li&gt;The purpose of a module is to build reusable config.&lt;/li&gt;
&lt;li&gt;Divided into root module, child module, and &lt;s&gt;published module&lt;/s&gt;.&lt;/li&gt;
&lt;li&gt;Inputs module
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A variable defined in variables.tf can be referenced inside that module as ${var.var_name}; the calling module sets its value.&lt;/li&gt;
&lt;li&gt;If no value is set for it, the user is prompted for one on the CLI when terragrunt runs.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Outputs module
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Output values declared in outputs.tf are returned to other modules.&lt;/li&gt;
&lt;li&gt;Accessible as module.&amp;lt;Module_name&amp;gt;.&amp;lt;Output_name&amp;gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Local module
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Declared and used inside a module. Not overwritten by external values, and applied with the highest priority.&lt;/li&gt;
&lt;li&gt;Referenced as local.&amp;lt;Name&amp;gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1757414894402&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;// Input module

// modules/services/webserver-cluster/variables.tf 
variable &quot;cluster_name&quot; {
  description = &quot;The name to use for all the cluster resources&quot;
  type        = string
}

// stage/services/webserver-cluster/main.tf
module &quot;webserver_cluster&quot; {
  source = &quot;../../../modules/services/webserver-cluster&quot;
  cluster_name           = &quot;webservers-prod&quot;
  db_remote_state_bucket = &quot;(YOUR_BUCKET_NAME)&quot;
  db_remote_state_key    = &quot;prod/data-stores/mysql/terraform.tfstate&quot;
}&lt;/code&gt;&lt;/pre&gt;
&lt;pre id=&quot;code_1757419286942&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;// Output module
output &quot;asg_name&quot; {
  value       = aws_autoscaling_group.example.name
  description = &quot;The name of the Auto Scaling Group&quot;
}&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Resource&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The smallest unit of infrastructure defined in a module.&lt;/li&gt;
&lt;li&gt;Corresponds to components such as Helm or a k8s configmap or deployment.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;State&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Terragrunt tracks the configuration deployed through IaC in a state file.&lt;/li&gt;
&lt;li&gt;The definition of the state file is specified in base.hcl.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Dependency&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
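&lt;li&gt;Dependencies are declared with a dependency block in the unit's terragrunt.hcl.&lt;/li&gt;
&lt;li&gt;Modules are split into groups and deployed in order of their resource dependencies.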
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;For example, loki, which uses minio as its storage, must be deployed after minio.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1756551356971&quot; class=&quot;nginx&quot; style=&quot;background-color: #f8f8f8; color: #383a42; text-align: start;&quot; data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;bash&quot;&gt;&lt;code&gt;# path/to/loki/terragrunt.hcl

include &quot;base&quot; {
  path = find_in_parent_folders(&quot;base.hcl&quot;)
}

include &quot;kubernetes&quot; {
  path = find_in_parent_folders(&quot;kubernetes.hcl&quot;)
}

dependency &quot;minio&quot; {
  config_path = &quot;../minio&quot;
  
...&lt;/code&gt;&lt;/pre&gt;
&lt;pre id=&quot;code_1756551263490&quot; class=&quot;routeros&quot; style=&quot;background-color: #f8f8f8; color: #383a42; text-align: start;&quot; data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;bash&quot;&gt;&lt;code&gt;# terragrunt run-all plan

Group 1
- Module ./minio

Group 2
- Module ./loki
...&lt;/code&gt;&lt;/pre&gt;
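&lt;p data-ke-size=&quot;size16&quot;&gt;The grouping shown above can be reproduced with a small level-by-level topological sort. This is only a sketch of the idea, not Terragrunt's implementation; the function name is hypothetical and it assumes every dependency is itself listed as a unit.&lt;/p&gt;

```python
def deployment_groups(dependencies):
    '''Group units so that each unit comes after everything it depends on.

    dependencies maps a unit name to the list of units it depends on.
    '''
    remaining = {unit: set(deps) for unit, deps in dependencies.items()}
    groups = []
    while remaining:
        # Units with no unresolved dependencies can be deployed together.
        ready = sorted(unit for unit, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError('dependency cycle detected')
        groups.append(ready)
        for unit in ready:
            del remaining[unit]
        for deps in remaining.values():
            deps.difference_update(ready)
    return groups


# The minio and loki units from the example above.
units = {'minio': [], 'loki': ['minio']}
print(deployment_groups(units))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;For the two units in this post the result is minio in the first group and loki in the second, matching the plan output.&lt;/p&gt;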
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Quick Start&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;(WIP)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;CLI
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;terragrunt plan
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Shows a diff of the changes between the state and the stack, relative to the current directory.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;terragrunt apply
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Applies the changes in the stack.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;terragrunt run-all plan &amp;amp; apply
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Targets every resource and module under the current directory tree, not only the unit in the current path.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Reference&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;https://www.gruntwork.io/blog/how-to-create-reusable-infrastructure-with-terraform-modules&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://www.gruntwork.io/blog/how-to-create-reusable-infrastructure-with-terraform-modules&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>System Engineering/Terraform</category>
      <author>Hopulence</author>
      <guid isPermaLink="true">https://hopulence.tistory.com/52</guid>
      <comments>https://hopulence.tistory.com/52#entry52comment</comments>
      <pubDate>Sat, 30 Aug 2025 20:28:52 +0900</pubDate>
    </item>
    <item>
      <title>[Ceph] How mon timecheck works and MON_CLOCK_SKEW</title>
      <link>https://hopulence.tistory.com/51</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;While monitoring Ceph, the MON_CLOCK_SKEW alert kept firing even after I switched the NTP daemon from timesyncd to chrony, so I wrote this post to track down the cause.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h4 style=&quot;text-align: start;&quot; data-ke-size=&quot;size20&quot;&gt;The role of the monitor&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The Ceph mon maintains the cluster map
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The cluster map determines the location of every mon, osd, and mds.&lt;/li&gt;
&lt;li&gt;A client that uses csi must obtain the current cluster map from a mon before it reads from or writes to an osd or mds.&lt;/li&gt;
&lt;li&gt;Once it has an up-to-date cluster map, the client works out object locations with a CRUSH calculation and can talk to the osd directly.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;It also provides authentication and logging.&lt;/li&gt;
&lt;li&gt;Every change to the cluster is written to a single Paxos instance, which is stored as key/value pairs in RocksDB.&lt;/li&gt;
&lt;li&gt;The mons synchronize to the latest version of the cluster map and take RocksDB snapshots.&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;* What is Paxos?&lt;/p&gt;
&lt;p style=&quot;text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; &amp;nbsp;The consensus protocol used by Ceph.&lt;/p&gt;
&lt;p style=&quot;text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Cluster Map
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Contains the maps for mon, osd, pg (placement group), and mds (metadata server).&lt;/li&gt;
&lt;li&gt;It also includes the storage capacity and the state of each component.&lt;/li&gt;
&lt;li&gt;This information is versioned by epoch.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;910&quot; data-origin-height=&quot;750&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bNnxnV/btsQAYRVsmn/DKSGbsolMttVCAH15kAFGk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bNnxnV/btsQAYRVsmn/DKSGbsolMttVCAH15kAFGk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bNnxnV/btsQAYRVsmn/DKSGbsolMttVCAH15kAFGk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbNnxnV%2FbtsQAYRVsmn%2FDKSGbsolMttVCAH15kAFGk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;446&quot; height=&quot;368&quot; data-origin-width=&quot;910&quot; data-origin-height=&quot;750&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h4 style=&quot;text-align: start;&quot; data-ke-size=&quot;size20&quot;&gt;Monitor synchronization&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Each mon periodically checks the epoch of the cluster map. If its version falls behind, it leaves the quorum, synchronizes the latest information, and then rejoins the cluster.&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Leader: the mon that is first to achieve the most recent Paxos version of the cluster map&lt;/li&gt;
&lt;li&gt;Provider: a mon that has the most recent cluster map but was not the first to achieve it&lt;/li&gt;
&lt;li&gt;Requester: a mon whose version has fallen behind and that needs to synchronize and rejoin&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;958&quot; data-origin-height=&quot;1120&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bBq85Y/btsQyuqxu3N/yxmOosjV6bSKt3QrH4o5k0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bBq85Y/btsQyuqxu3N/yxmOosjV6bSKt3QrH4o5k0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bBq85Y/btsQyuqxu3N/yxmOosjV6bSKt3QrH4o5k0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbBq85Y%2FbtsQyuqxu3N%2FyxmOosjV6bSKt3QrH4o5k0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;368&quot; height=&quot;430&quot; data-origin-width=&quot;958&quot; data-origin-height=&quot;1120&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&amp;nbsp;&lt;/h4&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;1. The MON_CLOCK_SKEW alert&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Prometheus fires an alert when the MON_CLOCK_SKEW health check stays active for at least 1 minute.&lt;/li&gt;
&lt;li&gt;Ceph checks the time every mon_timecheck_interval (300s).&lt;/li&gt;
&lt;li&gt;During this check, a skew is detected when the offset exceeds mon_clock_drift_allowed, which defaults to 0.05 seconds.&lt;/li&gt;
&lt;li&gt;Once a 'skew' is detected, &lt;span style=&quot;text-align: start;&quot;&gt;the check is repeated every mon_timecheck_skew_interval (30s).&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1755942377754&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;      - alert: CephMonClockSkew
        annotations:
          description: Ceph monitors rely on closely synchronized time to maintain quorum
            and cluster consistency. This event indicates that the time on at least one
            mon has drifted too far from the lead mon. Review cluster status with ceph
            -s. This will show which monitors are affected. Check the time sync status
            on each monitor host with 'ceph time-sync-status' and the state and peers
            of your ntpd or chrony daemon.
          documentation: https://docs.ceph.com/en/latest/rados/operations/health-checks#mon-clock-skew
          summary: Clock skew detected among monitors
        expr: ceph_health_detail{name=&quot;MON_CLOCK_SKEW&quot;} == 1
        for: 1m
        labels:
          severity: warning
          type: ceph_default&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;2. How the Ceph health check works&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Skew is detected when the peon mons answer a ping triggered by the leader mon with a pong that carries their current time.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1755942482021&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;// ceph/src/mon/Monitor.h
 /**
   * @defgroup Monitor_h_TimeCheck Monitor Clock Drift Early Warning System
   * @{
   *  ...
   *  - Leader sends out a 'PING' message to each other monitor in the quorum.
   *    The message is timestamped with the leader's current time. The leader's
   *    current time is recorded in a map, associated with each peon's
   *    instance.
   *  - The peon replies to the leader with a timestamped 'PONG' message.
   *  - The leader calculates a delta between the peon's timestamp and its
   *    current time and stashes it.
   *  - The leader also calculates the time it took to receive the 'PONG'
   *    since the 'PING' was sent, and stashes an approximate latency estimate.
   *  - Once all the quorum members have pong'ed, the leader will share the
   *    clock skew and latency maps with all the monitors in the quorum.
   */&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;After the quorum is first formed, the mon that wins the election and becomes leader starts the time check: timecheck_start() is called, which in turn leads to timecheck().&lt;/li&gt;
&lt;li&gt;In timecheck(), the leader mon records the current time for each peon and sends it an MTimeCheck2 message carrying the epoch and round.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1757311125251&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;// ceph/src/mon/Monitor.cc
void Monitor::timecheck() {
  ...

  for (set&amp;lt;int&amp;gt;::iterator it = quorum.begin(); it != quorum.end(); ++it) {
    if (monmap-&amp;gt;get_name(*it) == name)
      continue;

    utime_t curr_time = ceph_clock_now();
    timecheck_waiting[*it] = curr_time;
    MTimeCheck2 *m = new MTimeCheck2(MTimeCheck2::OP_PING);
    m-&amp;gt;epoch = get_epoch();
    m-&amp;gt;round = timecheck_round;
    dout(10) &amp;lt;&amp;lt; __func__ &amp;lt;&amp;lt; &quot; send &quot; &amp;lt;&amp;lt; *m &amp;lt;&amp;lt; &quot; to mon.&quot; &amp;lt;&amp;lt; *it &amp;lt;&amp;lt; dendl;
    send_mon_message(m, *it);
  }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The MTimeCheck2 class declares a timestamp field, and the peon puts its current time into it when sending the pong.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1757312831137&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;// ceph/src/messages/MTimeCheck2.h

class MTimeCheck2 final : public Message {
...
  enum {
    OP_PING = 1,
    OP_PONG = 2,
    OP_REPORT = 3,
  };

  int op = 0;
  version_t epoch = 0;
  version_t round = 0;

  utime_t timestamp;
  std::map&amp;lt;int, double&amp;gt; skews;
  std::map&amp;lt;int, double&amp;gt; latencies;
...&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;On receiving the ping, the peon calls ceph_clock_now() and replies with a pong.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1757329712211&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;void Monitor::handle_timecheck_peon(MonOpRequestRef op) {
   ...

  ceph_assert((timecheck_round % 2) != 0);
  MTimeCheck2 *reply = new MTimeCheck2(MTimeCheck2::OP_PONG);
  utime_t curr_time = ceph_clock_now();
  reply-&amp;gt;timestamp = curr_time;
  reply-&amp;gt;epoch = m-&amp;gt;epoch;
  reply-&amp;gt;round = m-&amp;gt;round;
  dout(10) &amp;lt;&amp;lt; __func__ &amp;lt;&amp;lt; &quot; send &quot; &amp;lt;&amp;lt; *m
           &amp;lt;&amp;lt; &quot; to &quot; &amp;lt;&amp;lt; m-&amp;gt;get_source_inst() &amp;lt;&amp;lt; dendl;
  m-&amp;gt;get_connection()-&amp;gt;send_message(reply);
}&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;The Ceph mon reads the current system time through ceph_clock_now(), which calls clock_gettime().&lt;/li&gt;
&lt;li&gt;CLOCK_REALTIME
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: -apple-system, BlinkMacSystemFont, 'Helvetica Neue', 'Apple SD Gothic Neo', Arial, sans-serif; letter-spacing: 0px;&quot;&gt;This is the wall-clock time, which NTP can adjust through adjtime(), adjtimex(), and similar calls.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: -apple-system, BlinkMacSystemFont, 'Helvetica Neue', 'Apple SD Gothic Neo', Arial, sans-serif; letter-spacing: 0px;&quot;&gt;gettimeofday() is marked obsolete and clock_gettime() is the recommended call. As the code below shows, Ceph uses clock_gettime(CLOCK_REALTIME) on Linux and falls back to gettimeofday() on other platforms.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1755943899278&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;static inline utime_t ceph_clock_now() {
#if defined(__linux__)
  struct timespec tp;
  clock_gettime(CLOCK_REALTIME, &amp;amp;tp);
  utime_t n(tp);
#else
  struct timeval tv;
  gettimeofday(&amp;amp;tv, nullptr);
  utime_t n(&amp;amp;tv);
#endif
  return n;
}&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;After receiving the peon's timestamp in the pong, the leader computes the skew.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span style=&quot;text-align: left;&quot;&gt;First it computes the latency: the difference between &lt;u&gt;the current time when the leader's handler runs on receiving the pong&lt;/u&gt; and &lt;i&gt;&lt;u&gt;the time the ping was sent&lt;/u&gt;&lt;/i&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;Next it computes delta, the difference between the peon's timestamp and the leader's current time, and takes its absolute value.&lt;/li&gt;
&lt;li&gt;The skew is then |delta| - latency.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;It then calls timecheck_status(), which compares the skew against the mon_clock_drift_allowed config value.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1757330011023&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;// ceph/src/mon/Monitor.cc
void Monitor::handle_timecheck_leader(MonOpRequestRef op) {
  ...
  utime_t curr_time = ceph_clock_now();

  double latency = (double)(curr_time - timecheck_sent);

  ...

  double delta = ((double) m-&amp;gt;timestamp) - ((double) curr_time);
  double abs_delta = (delta &amp;gt; 0 ? delta : -delta);
  double skew_bound = abs_delta - latency;
  ...

  health_status_t status = timecheck_status(ss, skew_bound, latency);&lt;/code&gt;&lt;/pre&gt;
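&lt;p data-ke-size=&quot;size16&quot;&gt;The arithmetic above can be tried out in isolation. The following is a minimal Python sketch, not Ceph code: the names skew_bound, has_skew, ping_sent, pong_received and peon_timestamp are made up for illustration, and the 0.05 s default is an assumption taken from the 50ms threshold used in this post.&lt;/p&gt;

```python
def skew_bound(ping_sent, pong_received, peon_timestamp):
    """Repeat the leader-side arithmetic on plain floats (seconds).

    ping_sent       leader clock when the ping went out
    pong_received   leader clock when the pong handler runs
    peon_timestamp  peon clock value carried in the pong
    """
    latency = pong_received - ping_sent
    delta = peon_timestamp - pong_received
    return abs(delta) - latency, latency


def has_skew(bound, drift_allowed=0.05):
    # Assumed default of 0.05 s. The sign of the bound is ignored,
    # as in timecheck_has_skew().
    return abs(bound) > drift_allowed


if __name__ == "__main__":
    bound, latency = skew_bound(0.001, 0.031, 0.019)
    print(round(latency, 3), round(bound, 3), has_skew(bound))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;With the ping sent at 1ms, the pong handled at 31ms and a peon timestamp of 19ms, the latency is about 30ms and the bound about -18ms, which stays inside the allowed drift. Because only the absolute value is compared, a strongly negative bound (very high latency) would also count as skew in this sketch.&lt;/p&gt;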
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;If the absolute value of the skew exceeds mon_clock_drift_allowed, HEALTH_WARN is raised.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1757345585431&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;health_status_t Monitor::timecheck_status(ostringstream &amp;amp;ss, const double skew_bound, const double latency) {
...
  double abs_skew;
  if (timecheck_has_skew(skew_bound, &amp;amp;abs_skew)) {
    status = HEALTH_WARN;
...&lt;/code&gt;&lt;/pre&gt;
&lt;pre id=&quot;code_1757329760142&quot; class=&quot;cpp&quot; style=&quot;background-color: #f8f8f8; color: #383a42; text-align: start;&quot; data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;cpp&quot;&gt;&lt;code&gt;  // ceph/src/mon/Monitor.h
  
  bool timecheck_has_skew(const double skew_bound, double *abs) const {
    double abs_skew = std::fabs(skew_bound);
    if (abs)
      *abs = abs_skew;
    return (abs_skew &amp;gt; g_conf()-&amp;gt;mon_clock_drift_allowed);
  }&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The leader mon repeats the timecheck on a timer, asynchronously from pong reception.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;If found_skew is true, timecheck_rounds_since_clean is incremented.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1757329597037&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;void Monitor::timecheck_check_skews()
{
...
  bool found_skew = false;
  for (auto&amp;amp; p : timecheck_skews) {
    double abs_skew;
    if (timecheck_has_skew(p.second, &amp;amp;abs_skew)) {
      dout(10) &amp;lt;&amp;lt; __func__
               &amp;lt;&amp;lt; &quot; &quot; &amp;lt;&amp;lt; p.first &amp;lt;&amp;lt; &quot; skew &quot; &amp;lt;&amp;lt; abs_skew &amp;lt;&amp;lt; dendl;
      found_skew = true;
    }
  }

  if (found_skew) {
    ++timecheck_rounds_since_clean;
    timecheck_reset_event();
    ...&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
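&lt;li&gt;If found_skew is false, the next timecheck runs after the timer defined by mon_timecheck_interval (300s).&lt;/li&gt;
&lt;li&gt;If found_skew is true, the next timecheck runs after a multiple of mon_timecheck_skew_interval (30s): the interval times timecheck_rounds_since_clean, capped at mon_timecheck_interval.&lt;/li&gt;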
&lt;/ul&gt;
&lt;pre id=&quot;code_1757329408888&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;// ceph/src/mon/Monitor.cc
void Monitor::timecheck_reset_event() {
...
  double delay =
    cct-&amp;gt;_conf-&amp;gt;mon_timecheck_skew_interval * timecheck_rounds_since_clean;

  if (delay &amp;lt;= 0 || delay &amp;gt; cct-&amp;gt;_conf-&amp;gt;mon_timecheck_interval) {
    delay = cct-&amp;gt;_conf-&amp;gt;mon_timecheck_interval;
  }
...

  timecheck_event = timer.add_event_after(
    delay,
    new C_MonContext{this, [this](int) {
	timecheck_start_round();
      }});
}&lt;/code&gt;&lt;/pre&gt;
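&lt;p data-ke-size=&quot;size16&quot;&gt;To see how the re-check delay grows, the calculation in timecheck_reset_event() can be sketched in Python. This is only an illustration: next_timecheck_delay and its parameter names are hypothetical, and the 30s and 300s defaults are the values quoted above.&lt;/p&gt;

```python
def next_timecheck_delay(rounds_since_clean, skew_interval=30.0, interval=300.0):
    """Seconds until the next timecheck round, following timecheck_reset_event()."""
    delay = skew_interval * rounds_since_clean
    # Fall back to the regular interval when there is no recent skew (delay of 0)
    # or when the back-off has grown past the regular interval.
    if not (interval >= delay > 0):
        delay = interval
    return delay


if __name__ == "__main__":
    for rounds in (0, 1, 2, 5, 10, 11):
        print(rounds, next_timecheck_delay(rounds))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Under these assumptions the first re-check after a skew comes 30s later, then 60s, 90s and so on, until the delay reaches the regular 300s.&lt;/p&gt;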
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The overall flow looks like this.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;3884&quot; data-origin-height=&quot;2766&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/tLJby/btsQoI22yhC/h2bmkwoUnKincPOVakbwc1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/tLJby/btsQoI22yhC/h2bmkwoUnKincPOVakbwc1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/tLJby/btsQoI22yhC/h2bmkwoUnKincPOVakbwc1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FtLJby%2FbtsQoI22yhC%2Fh2bmkwoUnKincPOVakbwc1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;3884&quot; height=&quot;2766&quot; data-origin-width=&quot;3884&quot; data-origin-height=&quot;2766&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;3. What the skew value means&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Small clock error and small latency
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The skew is below 50ms, so no alert fires.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Actual clock error&lt;/b&gt; = 8ms&lt;br /&gt;&lt;b&gt;Latency&lt;/b&gt; = 31ms - 1ms = 30ms&lt;br /&gt;&lt;b&gt;Delta&lt;/b&gt; = 19ms - 31ms = -12ms&lt;br /&gt;&lt;b&gt;Skew&lt;/b&gt; = 12ms - 30ms = -18ms&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1948&quot; data-origin-height=&quot;1618&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/zuel1/btsQq172N7b/W7zgHIkyZO0Hrg3kqdSGL0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/zuel1/btsQq172N7b/W7zgHIkyZO0Hrg3kqdSGL0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/zuel1/btsQq172N7b/W7zgHIkyZO0Hrg3kqdSGL0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fzuel1%2FbtsQq172N7b%2FW7zgHIkyZO0Hrg3kqdSGL0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;455&quot; height=&quot;378&quot; data-origin-width=&quot;1948&quot; data-origin-height=&quot;1618&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Large clock error (positive skew)
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Actual clock error&lt;/b&gt; = 100ms&lt;br /&gt;&lt;b&gt;Latency&lt;/b&gt; = 31ms - 1ms = 30ms&lt;br /&gt;&lt;b&gt;Delta&lt;/b&gt; = 111ms - 31ms = 80ms&lt;br /&gt;&lt;b&gt;Skew&lt;/b&gt; = 80ms - 30ms = 50ms&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;516&quot; data-origin-height=&quot;434&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/pCJjG/btsQpw8VYLR/kJRFj6xHgvZqGDUEZ2B08k/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/pCJjG/btsQpw8VYLR/kJRFj6xHgvZqGDUEZ2B08k/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/pCJjG/btsQpw8VYLR/kJRFj6xHgvZqGDUEZ2B08k/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FpCJjG%2FbtsQpw8VYLR%2FkJRFj6xHgvZqGDUEZ2B08k%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;516&quot; height=&quot;434&quot; data-origin-width=&quot;516&quot; data-origin-height=&quot;434&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Large latency (negative skew)
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Actual clock error&lt;/b&gt; = 8ms&lt;br /&gt;&lt;b&gt;Latency&lt;/b&gt; = 100ms&lt;br /&gt;&lt;b&gt;Delta&lt;/b&gt; = 49ms - 101ms = -52ms&lt;br /&gt;&lt;b&gt;Skew&lt;/b&gt; = 52ms - 100ms = -48ms&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;523&quot; data-origin-height=&quot;451&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/x3B2H/btsQrI8i44z/vmQkbYxlKUgEbDSbnIfeD1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/x3B2H/btsQrI8i44z/vmQkbYxlKUgEbDSbnIfeD1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/x3B2H/btsQrI8i44z/vmQkbYxlKUgEbDSbnIfeD1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fx3B2H%2FbtsQrI8i44z%2FvmQkbYxlKUgEbDSbnIfeD1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;523&quot; height=&quot;451&quot; data-origin-width=&quot;523&quot; data-origin-height=&quot;451&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Even with the same clock error and latency, the result can differ depending on how the delay is split between the send and receive directions.&lt;/li&gt;
&lt;/ul&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%; height: 380px;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot;&gt;
&lt;tbody&gt;
&lt;tr style=&quot;height: 380px;&quot;&gt;
&lt;td style=&quot;width: 50%; height: 380px;&quot;&gt;Receive delay is larger&lt;br /&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;530&quot; data-origin-height=&quot;439&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/pgmmj/btsQpAXD63N/dXsuHHjvLl8hPi5sUW3xdK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/pgmmj/btsQpAXD63N/dXsuHHjvLl8hPi5sUW3xdK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/pgmmj/btsQpAXD63N/dXsuHHjvLl8hPi5sUW3xdK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fpgmmj%2FbtsQpAXD63N%2FdXsuHHjvLl8hPi5sUW3xdK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;324&quot; height=&quot;268&quot; data-origin-width=&quot;530&quot; data-origin-height=&quot;439&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Actual clock error&lt;/b&gt; = 90ms&lt;br /&gt;&lt;b&gt;Latency&lt;/b&gt; = 31ms - 1ms = 30ms&lt;br /&gt;&lt;b&gt;Delta&lt;/b&gt; = 101ms - 31ms = 70ms&lt;br /&gt;&lt;b&gt;Skew&lt;/b&gt; = 70ms - 30ms = 40ms&lt;/td&gt;
&lt;td style=&quot;width: 50%; height: 380px;&quot;&gt;&lt;br /&gt;Send delay is larger&lt;br /&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;930&quot; data-origin-height=&quot;798&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/ctR2Oo/btsQCsYu7gI/pwREuC9pylGm42bBhcb2MK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/ctR2Oo/btsQCsYu7gI/pwREuC9pylGm42bBhcb2MK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/ctR2Oo/btsQCsYu7gI/pwREuC9pylGm42bBhcb2MK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FctR2Oo%2FbtsQCsYu7gI%2FpwREuC9pylGm42bBhcb2MK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;326&quot; height=&quot;280&quot; data-origin-width=&quot;930&quot; data-origin-height=&quot;798&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Actual clock error&lt;/b&gt; = 90ms&lt;br /&gt;&lt;b&gt;Latency&lt;/b&gt; = 31ms - 1ms = 30ms&lt;br /&gt;&lt;b&gt;Delta&lt;/b&gt; = 111ms - 31ms = 80ms&lt;br /&gt;&lt;b&gt;Skew&lt;/b&gt; = 80ms - 30ms = 50ms&lt;br /&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The mon skew is not a plain clock error; it is a worst-case bound that also takes the RTT into account, including queueing and dispatch time.&lt;br /&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: -apple-system, BlinkMacSystemFont, 'Helvetica Neue', 'Apple SD Gothic Neo', Arial, sans-serif; letter-spacing: 0px;&quot;&gt;Skew = | delta | - latency &lt;/span&gt;= | clock error - receive delay | - RTT (including dispatch + queueing)&lt;/li&gt;
&lt;li&gt;A &lt;u&gt;positive skew points to a clock error&lt;/u&gt;, while a &lt;u&gt;negative skew points to high latency&lt;/u&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
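&lt;p data-ke-size=&quot;size16&quot;&gt;The cases above can be reproduced with a small model. This is a rough Python sketch under stated assumptions: the peon clock is ahead of the leader clock by clock_error, the ping is sent at 1ms, processing time on the peon is ignored, and the one-way delays in CASES are values chosen so that the timestamps match the numbers in the examples. The function name observed_skew_ms is hypothetical.&lt;/p&gt;

```python
def observed_skew_ms(clock_error, ping_delay, pong_delay, ping_sent=1):
    """Return (latency, delta, skew) in ms as the leader would compute them.

    clock_error  how far the peon clock is ahead of the leader clock
    ping_delay   one-way delay, leader to peon
    pong_delay   one-way delay, peon to leader
    """
    peon_timestamp = ping_sent + ping_delay + clock_error
    pong_received = ping_sent + ping_delay + pong_delay
    latency = pong_received - ping_sent
    delta = peon_timestamp - pong_received  # same as clock_error - pong_delay
    return latency, delta, abs(delta) - latency


# (clock_error, ping_delay, pong_delay); the delays are assumed values.
CASES = {
    "small error, small latency": (8, 10, 20),
    "large error": (100, 10, 20),
    "large latency": (8, 40, 60),
    "receive delay larger": (90, 10, 20),
    "send delay larger": (90, 20, 10),
}

if __name__ == "__main__":
    for name, args in CASES.items():
        print(name, observed_skew_ms(*args))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;In this model the ping delay cancels out of delta, so only the pong (receive) delay shifts the result. That is why the same 90ms error with the same 30ms latency gives different skews depending on the direction of the delay.&lt;/p&gt;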
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;When the NTP source and the Ceph cluster sit in the same physical zone, the L3 RTT (e.g. the ping result) usually stays under 1ms, and, barring special situations such as I/O wait, the TCP-level RTT including dispatch and queueing time is similar (assuming no network congestion).
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;So under normal conditions, I suspect skew is mostly the effect of NTP offset error.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p style=&quot;text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Conclusion&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
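&lt;li&gt;A skew warning does not necessarily mean that NTP is broken.&lt;/li&gt;
&lt;li&gt;The Ceph documentation notes that the alert can fire even when the NTP offset is normal, and recommends adjusting the allowed drift with workload and network in mind, so that monitor synchronization and Paxos are not affected.&lt;/li&gt;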
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Clock drift may still be noticeable with NTP even though the discrepancy is not yet harmful. Ceph's clock drift / clock skew warnings may get triggered even though NTP maintains a reasonable level of synchronization. Increasing your clock drift may be tolerable under such circumstances; however, a number of factors such as workload, network latency, configuring overrides to default timeouts and the Monitor Store Synchronization settings may influence the level of acceptable clock drift without compromising Paxos guarantees.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;How far can the allowed skew be raised?&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;mon lease timeout&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
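&lt;li&gt;The elected leader issues a lease and renews it periodically; while the lease is valid, the quorum members are considered to hold the latest cluster map.&lt;/li&gt;
&lt;li&gt;If the leader does not receive an ack for a lease renewal it sent to a peon before the timeout, it bootstraps and runs the leader election again.&lt;/li&gt;
&lt;li&gt;Likewise, if a peon has sent its ack and no renew message arrives within the next period, it assumes the leader is dead and starts a leader election.&lt;/li&gt;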
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The leader &lt;b&gt;extends the expiry time by mon_lease (5s)&lt;/b&gt;, &lt;b&gt;sends the message to the peons every renew interval (3s)&lt;/b&gt;, and &lt;b&gt;waits up to ack_timeout (10s) for the acks&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;A peon, after sending its ack, waits up to &lt;b&gt;ack_timeout (10s)&lt;/b&gt; for the next renewal.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;mon_lease (default 5.0)&lt;/li&gt;
&lt;li&gt;mon_lease_ack_timeout_factor (default 2.0)&lt;br /&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Factor, multiplied by mon_lease, for how long the leader waits for acks from the peons.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;mon_lease_renew_interval_factor (default 0.6)
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Factor, multiplied by mon_lease, for the interval at which the leader sends extend_lease messages to the peons.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;mon_accept_timeout_factor
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Factor for how long the Leader waits for the Requester's response during the Paxos recovery phase.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1758006595374&quot; class=&quot;lasso&quot; style=&quot;background-color: #f8f8f8; color: #383a42; text-align: start;&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;// ceph/src/mon/Paxos.cc
void Paxos::extend_lease() {
  ...
  lease_expire = ceph::real_clock::now();
  lease_expire += ceph::make_timespan(g_conf()-&amp;gt;mon_lease);
  ...&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1758004899777&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;void Paxos::extend_lease() {
  ...
  // set timeout event.
  if (!lease_ack_timeout_event) {
    lease_ack_timeout_event = mon.timer.add_event_after(
      g_conf()-&amp;gt;mon_lease_ack_timeout_factor * g_conf()-&amp;gt;mon_lease,
      new C_MonContext{&amp;amp;mon, [this](int r) {
	  if (r == -ECANCELED)
	    return;
	  lease_ack_timeout();
	}});
  }
  
 // set renew event
  auto at = lease_expire;
  at -= ceph::make_timespan(g_conf()-&amp;gt;mon_lease);
  at += ceph::make_timespan(g_conf()-&amp;gt;mon_lease_renew_interval_factor *
			    g_conf()-&amp;gt;mon_lease);
  lease_renew_event = mon.timer.add_event_at(
    at, new C_MonContext{&amp;amp;mon, [this](int r) {
	if (r == -ECANCELED)
	  return;
	lease_renew_timeout();
    }});
    ...&lt;/code&gt;&lt;/pre&gt;
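&lt;p data-ke-size=&quot;size16&quot;&gt;The three timers are all derived from mon_lease. As a quick check of the 5s / 10s / 3s figures, here is a small Python sketch; lease_timers and its dictionary keys are hypothetical names, and the defaults are the values listed above.&lt;/p&gt;

```python
def lease_timers(mon_lease=5.0, ack_timeout_factor=2.0, renew_interval_factor=0.6):
    """Derive the lease timers (seconds) from mon_lease and its factors."""
    return {
        # The lease expires mon_lease seconds after it is extended.
        "lease_duration": mon_lease,
        # The leader waits this long for acks before lease_ack_timeout().
        "ack_timeout": ack_timeout_factor * mon_lease,
        # The renew event fires this long after the lease was extended.
        "renew_interval": renew_interval_factor * mon_lease,
    }


if __name__ == "__main__":
    for name, seconds in lease_timers().items():
        print(name, seconds)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Changing mon_lease scales all three values together, which is worth keeping in mind before tuning only one of the factors.&lt;/p&gt;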
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;* What happens if the lease itself expires?&lt;br /&gt;&amp;nbsp;-&amp;gt; Only an error log is written.&lt;/p&gt;
&lt;pre id=&quot;code_1758012955599&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;void Paxos::handle_lease(MonOpRequestRef op) {
  ...
  // extend lease
  if (auto new_expire = lease-&amp;gt;lease_timestamp.to_real_time();
      lease_expire &amp;lt; new_expire) {
    lease_expire = new_expire;

    auto now = ceph::real_clock::now();
    if (lease_expire &amp;lt; now) {
      auto diff = now - lease_expire;
      derr &amp;lt;&amp;lt; &quot;lease_expire from &quot; &amp;lt;&amp;lt; lease-&amp;gt;get_source_inst() &amp;lt;&amp;lt; &quot; is &quot; &amp;lt;&amp;lt; diff &amp;lt;&amp;lt; &quot; seconds in the past; mons are probably laggy (or possibly clocks are too skewed)&quot; &amp;lt;&amp;lt; dendl;
    }
  }&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;When a Paxos timeout triggers an election, a tighter timeout applies.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;mon_election_timeout (default 5)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1758014622039&quot; class=&quot;cpp&quot; data-ke-language=&quot;cpp&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;void Elector::reset_timer(double plus) {
  ...
  expire_event = mon-&amp;gt;timer.add_event_after(
    g_conf()-&amp;gt;mon_election_timeout + plus,  // plus is added, but in practice it is between 0 and 1
    new C_MonContext{mon, [this](int) {
	logic.end_election_period();
      }});
}&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;To operate a stable system:&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;From the &lt;b&gt;leader&lt;/b&gt;'s side, &lt;b&gt;acks from the peons must arrive within 10 seconds&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;From the &lt;b&gt;peon&lt;/b&gt;'s side, &lt;b&gt;the renew message must arrive within 10 seconds after the ack is sent&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;If a lease renewal times out, &lt;b&gt;the leader must broadcast its victory within 5 seconds in the election&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;So, to stay within the most sensitive timeout (leader election) and to avoid the lease-expiry error log, I think it is reasonable to treat 5 seconds as the upper bound.&lt;/li&gt;
&lt;/ul&gt;
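&lt;p data-ke-size=&quot;size16&quot;&gt;As a minimal sketch of the reasoning above, the three budgets can be placed side by side and the tightest one picked. The dictionary keys and the helper name strictest_budget are made up for illustration; only the 10-second and 5-second values come from the list above.&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# Timeout budgets (seconds) taken from the bullets above.
# The key names are hypothetical labels, not Ceph option names.
BUDGETS = {
    'leader_waits_for_peon_ack': 10.0,
    'peon_waits_for_lease_renew': 10.0,
    'election_victory_broadcast': 5.0,  # mon_election_timeout default
}

def strictest_budget(budgets):
    '''Return (name, seconds) of the tightest timeout in budgets.'''
    name = min(budgets, key=budgets.get)
    return name, budgets[name]

name, seconds = strictest_budget(BUDGETS)
print(f'tune against {name}: {seconds} s')&lt;/code&gt;&lt;/pre&gt;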
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Reference&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/ceph/ceph/blob/f19a8ab044e4592db262ae2000083350981d9d0a/doc/rados/configuration/mon-config-ref.rst&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://github.com/ceph/ceph/blob/f19a8ab044e4592db262ae2000083350981d9d0a/doc/rados/configuration/mon-config-ref.rst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;figure id=&quot;og_1756555482014&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;object&quot; data-og-title=&quot;ceph/doc/rados/configuration/mon-config-ref.rst at f19a8ab044e4592db262ae2000083350981d9d0a &amp;middot; ceph/ceph&quot; data-og-description=&quot;Ceph is a distributed object, block, and file storage platform - ceph/ceph&quot; data-og-host=&quot;github.com&quot; data-og-source-url=&quot;https://github.com/ceph/ceph/blob/f19a8ab044e4592db262ae2000083350981d9d0a/doc/rados/configuration/mon-config-ref.rst&quot; data-og-url=&quot;https://github.com/ceph/ceph/blob/f19a8ab044e4592db262ae2000083350981d9d0a/doc/rados/configuration/mon-config-ref.rst&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/TWwAd/hyZGlbv10F/GQTDjMFHNDncbcilH62xy0/img.png?width=1280&amp;amp;height=640&amp;amp;face=0_0_1280_640,https://scrap.kakaocdn.net/dn/cO2HE9/hyZC9cJ3tF/9FXv0bGlACkXQZgfoBQmB0/img.png?width=1280&amp;amp;height=640&amp;amp;face=0_0_1280_640&quot;&gt;&lt;a href=&quot;https://github.com/ceph/ceph/blob/f19a8ab044e4592db262ae2000083350981d9d0a/doc/rados/configuration/mon-config-ref.rst&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://github.com/ceph/ceph/blob/f19a8ab044e4592db262ae2000083350981d9d0a/doc/rados/configuration/mon-config-ref.rst&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/TWwAd/hyZGlbv10F/GQTDjMFHNDncbcilH62xy0/img.png?width=1280&amp;amp;height=640&amp;amp;face=0_0_1280_640,https://scrap.kakaocdn.net/dn/cO2HE9/hyZC9cJ3tF/9FXv0bGlACkXQZgfoBQmB0/img.png?width=1280&amp;amp;height=640&amp;amp;face=0_0_1280_640');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;ceph/doc/rados/configuration/mon-config-ref.rst at f19a8ab044e4592db262ae2000083350981d9d0a &amp;middot; ceph/ceph&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;Ceph is a distributed object, block, and file storage platform - ceph/ceph&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;github.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>System Engineering/Ceph</category>
      <author>Hopulence</author>
      <guid isPermaLink="true">https://hopulence.tistory.com/51</guid>
      <comments>https://hopulence.tistory.com/51#entry51comment</comments>
      <pubDate>Tue, 26 Aug 2025 21:56:43 +0900</pubDate>
    </item>
    <item>
      <title>Understanding InfiniBand (1) - Architecture and Headers</title>
      <link>https://hopulence.tistory.com/49</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;&lt;b&gt;1. What is InfiniBand (IB)?&lt;/b&gt;&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;InfiniBand is a low-latency communication standard for RDMA (Remote Direct Memory Access) of large, TB-scale data such as models and checkpoints in HPC (High Performance Computing) environments that serve workloads like AI.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;What is RDMA?
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A conventionally received packet travels ring buffer &amp;gt; DMA &amp;gt; CPU &amp;gt; kernel buffer &amp;gt; userspace. Along the way an IRQ forces a CPU context switch and the data is copied with memcpy().&lt;/li&gt;
&lt;li&gt;With RDMA, the packet does not pass through the CPU. The NIC or HCA offloads the processing and writes directly to memory. (Zero copy)
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The sending HCA receives a table that maps the receiver's virtual memory pages to physical memory, and uses it to identify the memory region (MR) it should read from or write to.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1222&quot; data-origin-height=&quot;146&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/ntoDN/btsOWnZvfn3/3to1yuLG5t7dJeLWWL1CCK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/ntoDN/btsOWnZvfn3/3to1yuLG5t7dJeLWWL1CCK/img.png&quot; data-alt=&quot;Packet receive path&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/ntoDN/btsOWnZvfn3/3to1yuLG5t7dJeLWWL1CCK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FntoDN%2FbtsOWnZvfn3%2F3to1yuLG5t7dJeLWWL1CCK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1222&quot; height=&quot;146&quot; data-origin-width=&quot;1222&quot; data-origin-height=&quot;146&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Packet receive path&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;3408&quot; data-origin-height=&quot;1092&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/7XDtA/btsO5CiJymu/5054jZ0xTXNEmU7YmbVg20/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/7XDtA/btsO5CiJymu/5054jZ0xTXNEmU7YmbVg20/img.png&quot; data-alt=&quot;Conventional packet transfer (TCP/UDP) vs. RDMA&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/7XDtA/btsO5CiJymu/5054jZ0xTXNEmU7YmbVg20/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F7XDtA%2FbtsO5CiJymu%2F5054jZ0xTXNEmU7YmbVg20%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;3408&quot; height=&quot;1092&quot; data-origin-width=&quot;3408&quot; data-origin-height=&quot;1092&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Conventional packet transfer (TCP/UDP) vs. RDMA&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;&lt;b&gt;2. IB architecture &amp;amp; component&lt;/b&gt;&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1446&quot; data-origin-height=&quot;934&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/E3niN/btsOWJnGf7h/rH1tQGksaH5bQWPBoOjjxk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/E3niN/btsOWJnGf7h/rH1tQGksaH5bQWPBoOjjxk/img.png&quot; data-alt=&quot;Infiniband Component&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/E3niN/btsOWJnGf7h/rH1tQGksaH5bQWPBoOjjxk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FE3niN%2FbtsOWJnGf7h%2FrH1tQGksaH5bQWPBoOjjxk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1446&quot; height=&quot;934&quot; data-origin-width=&quot;1446&quot; data-origin-height=&quot;934&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Infiniband Component&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;HCA: Host Channel Adapter - the IB adapter on a host (compute node)&lt;/li&gt;
&lt;li&gt;TCA: Target Channel Adapter - the IB adapter on an I/O target such as a storage device&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;IB library and ULP(Upper Layer Protocol)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1448&quot; data-origin-height=&quot;1140&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/7busk/btsOUdxhUuT/5qUhJH3foPtJptX4dphr9k/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/7busk/btsOUdxhUuT/5qUhJH3foPtJptX4dphr9k/img.png&quot; data-alt=&quot;InfiniBand Software Stack&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/7busk/btsOUdxhUuT/5qUhJH3foPtJptX4dphr9k/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F7busk%2FbtsOUdxhUuT%2F5qUhJH3foPtJptX4dphr9k%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1448&quot; height=&quot;1140&quot; data-origin-width=&quot;1448&quot; data-origin-height=&quot;1140&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;InfiniBand Software Stack&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;IB's native API is called Verbs. IB works with Verbs and with ULPs such as IPoIB (IP over IB), OpenMPI, and UCX.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;b&gt;Subnet Manager(SM) (e.g. openSM)&lt;/b&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The SM is an SDN controller that manages the IB fabric. Every subnet in an IB network needs at least one SM.&amp;nbsp;&lt;br /&gt;(Several SMs can run at once, but only one is active.)&lt;/li&gt;
&lt;li&gt;It can run on any device in the subnet (IB switch or host).&amp;nbsp;&lt;/li&gt;
&lt;li&gt;A standby SM keeps a copy of the active SM's state and takes over if the active SM dies.&lt;/li&gt;
&lt;li&gt;When a new device is detected in the fabric, the SM assigns it a LID. It also detects link up/down events and failover. (Subnet topology discovery)
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The IB fabric is managed through frames called MADs (Management Datagrams).&lt;/li&gt;
&lt;li&gt;The SM sends and receives SMPs (Subnet Management Packets), one kind of MAD, to perform discovery and assign LIDs.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;It manages a GID-based DR (Directed Routing) table (similar to L3) and an LR (LID Routing) table (similar to L2).
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Because IB uses 16-bit LIDs within a subnet, its lookup table is simpler than one keyed on Ethernet's 48-bit MAC addresses.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
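&lt;p data-ke-size=&quot;size16&quot;&gt;To make the 16-bit versus 48-bit comparison concrete, here is a small sketch. It assumes a switch could keep one flat array indexed directly by LID. The helpers set_route and lookup, the LID 0x11, and port 3 are all hypothetical and do not describe how any real switch stores its table.&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;LID_BITS = 16   # LID width within a subnet
MAC_BITS = 48   # Ethernet MAC address width

lid_space = 2 ** LID_BITS
mac_space = 2 ** MAC_BITS

# With only 65536 possible LIDs, a flat array indexed by LID is enough;
# no hashing of the key is needed. None means no route is known.
forwarding_table = [None] * lid_space

def set_route(table, lid, out_port):
    '''Record that packets addressed to lid leave through out_port.'''
    table[lid] = out_port

def lookup(table, lid):
    '''Return the output port for lid, or None if it is unassigned.'''
    return table[lid]

set_route(forwarding_table, 0x11, 3)  # hypothetical LID and port
print(lid_space, mac_space // lid_space)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The second printed number is how many times larger the MAC address space is than the LID space.&lt;/p&gt;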
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;&lt;b&gt;3. IB Header&lt;/b&gt;&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1460&quot; data-origin-height=&quot;924&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bmnh6K/btsOVZSroPD/2WeedoHky3wp5iFwi6G2VK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bmnh6K/btsOVZSroPD/2WeedoHky3wp5iFwi6G2VK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bmnh6K/btsOVZSroPD/2WeedoHky3wp5iFwi6G2VK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbmnh6K%2FbtsOVZSroPD%2F2WeedoHky3wp5iFwi6G2VK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1460&quot; height=&quot;924&quot; data-origin-width=&quot;1460&quot; data-origin-height=&quot;924&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;&lt;b&gt;LRH(Local Routing Header)&lt;/b&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;Src. &amp;amp; Dst. LID
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;A LID is assigned per GUID&lt;/li&gt;
&lt;li&gt;One LID per HCA port, and only one per switch (for unicast)&lt;/li&gt;
&lt;li&gt;LIDs are volatile: they are lost when the HCA or switch is powered off&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;Service Level (SL)&lt;/li&gt;
&lt;li&gt;Virtual Lane (VL): corresponds to QoS
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;VL 15 = SM traffic (high priority)&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;Packet length&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;GRH(Global Routing Header)&lt;/b&gt; - optional
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;The upper 64 bits of the 128-bit, IPv6-format GID identify the subnet.&lt;/li&gt;
&lt;li&gt;The lower 64 bits of the GID are the GUID (Globally Unique ID), a unique value assigned by the manufacturer to each HCA or switch port.&lt;br /&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;IPv6 Multicast: FF12:601B:{P_key}::{Group ID}&lt;/li&gt;
&lt;li&gt;IPv4 Multicast: FF12:401B:{P_key}::{Group ID}&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;BTH(Base Transport Header)&lt;/b&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;Packets are reassembled based on the Dst. QP and the PSN (Packet Sequence Number)&lt;/li&gt;
&lt;li&gt;P_key (Partition key): a logical segment, equivalent to a VLAN at OSI L2.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;ETH(Extended Transport Header)&lt;/b&gt;&lt;/li&gt;
&lt;/ol&gt;
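&lt;p data-ke-size=&quot;size16&quot;&gt;The GID layout described under GRH (upper 64 bits for the subnet, lower 64 bits for the GUID) can be sketched with Python's standard ipaddress module. The helper split_gid and the sample GID are made up for illustration.&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import ipaddress

def split_gid(gid_text):
    '''Split a GID into (subnet_prefix, guid), each a 64-bit integer.'''
    value = int(ipaddress.IPv6Address(gid_text))
    return value // 2 ** 64, value % 2 ** 64

# Hypothetical GID, for illustration only.
prefix, guid = split_gid('fe80::2:c903:ab:cdef')
print(f'subnet prefix = {prefix:016x}')
print(f'GUID          = {guid:016x}')&lt;/code&gt;&lt;/pre&gt;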
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;&lt;b&gt;4. IB Layer&lt;/b&gt;&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1436&quot; data-origin-height=&quot;922&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/ViuQn/btsOWDg4dUJ/58YsrUgXthckmcKrsLW66K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/ViuQn/btsOWDg4dUJ/58YsrUgXthckmcKrsLW66K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/ViuQn/btsOWDg4dUJ/58YsrUgXthckmcKrsLW66K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FViuQn%2FbtsOWDg4dUJ%2F58YsrUgXthckmcKrsLW66K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1436&quot; height=&quot;922&quot; data-origin-width=&quot;1436&quot; data-origin-height=&quot;922&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;b&gt;1. Physical layer&lt;/b&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;By medium, IB cables come as either AOC (optical) or DAC (copper).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1428&quot; data-origin-height=&quot;668&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/szjOl/btsOWC3vgVl/Hy8kL4zmjoTvzkxwg98AO0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/szjOl/btsOWC3vgVl/Hy8kL4zmjoTvzkxwg98AO0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/szjOl/btsOWC3vgVl/Hy8kL4zmjoTvzkxwg98AO0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FszjOl%2FbtsOWC3vgVl%2FHy8kL4zmjoTvzkxwg98AO0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;531&quot; height=&quot;248&quot; data-origin-width=&quot;1428&quot; data-origin-height=&quot;668&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;By bandwidth, they are classified as follows.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;EDR(100G) - QSFP28&lt;/li&gt;
&lt;li&gt;HDR(200G) - QSFP56&lt;/li&gt;
&lt;li&gt;NDR(400G) - QSFP112 or QSFP56-DD&lt;/li&gt;
&lt;li&gt;XDR(800G) - OSFP or QSFP112-DD&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Connector specification&lt;/li&gt;
&lt;/ul&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot; data-ke-style=&quot;style12&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;&lt;b&gt;Connector Cage&lt;/b&gt;&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;&lt;b&gt;Max bps/lane&lt;/b&gt;&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;&lt;b&gt;lanes&lt;/b&gt;&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;&lt;b&gt;Max B/W(bps)&lt;/b&gt;&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;&lt;b&gt;Compatible modules&lt;/b&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;QSFP56&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;50G&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;4&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;200G&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;QSFP56&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;QSFP56-DD&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;50G&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;8&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;400G&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;QSFP56&lt;br /&gt;QSFP56-DD&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;QSFP112&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;100G&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;4&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;400G&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;QSFP56&lt;br /&gt;QSFP112&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;QSFP112-DD&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;100G&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;8&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;800G&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;QSFP56&lt;br /&gt;QSFP56-DD&lt;br /&gt;QSFP112&lt;br /&gt;QSFP112-DD&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;OSFP&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;100G&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;8&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;800G&lt;/td&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;OSFP&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
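&lt;p data-ke-size=&quot;size16&quot;&gt;The Max B/W column is simply the per-lane rate multiplied by the number of lanes. A short sketch that recomputes it from the rows above (the function name max_bandwidth_gbps is made up for illustration):&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# (connector cage, Gb/s per lane, lanes) copied from the table above.
CAGES = [
    ('QSFP56', 50, 4),
    ('QSFP56-DD', 50, 8),
    ('QSFP112', 100, 4),
    ('QSFP112-DD', 100, 8),
    ('OSFP', 100, 8),
]

def max_bandwidth_gbps(per_lane_gbps, lanes):
    '''Aggregate bandwidth is the per-lane rate times the lane count.'''
    return per_lane_gbps * lanes

for name, per_lane, lanes in CAGES:
    print(f'{name}: {max_bandwidth_gbps(per_lane, lanes)}G')&lt;/code&gt;&lt;/pre&gt;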
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;IB Switch specification&lt;/li&gt;
&lt;/ul&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%; height: 102px;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot; data-ke-style=&quot;style12&quot;&gt;
&lt;tbody&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;&amp;nbsp;&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;&lt;b&gt;Q3000 Series&lt;/b&gt;&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;&lt;b&gt;MQM9700 Series&lt;/b&gt;&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;&lt;b&gt;QM8700 Series&lt;/b&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;Switch radix&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;36/144 XDR&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;64 NDR&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;40 HDR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;Connector type&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;OSFP/MPO&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;OSFP/QSFP112&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;QSFP56&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;Rack mount&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;2U/4U&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;1U&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;1U&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;Cooling&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;Air or Liquid&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;Air&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;Air&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;Management type&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;Managed only&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;Managed only&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 17px;&quot;&gt;Managed/Unmanaged&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;b&gt;2. Link Layer&lt;/b&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;There are two kinds of packets: management and data.&lt;/li&gt;
&lt;li&gt;Every device is assigned a 16-bit LID by the SM within its subnet.&lt;/li&gt;
&lt;li&gt;Similar to OSI L2, VLs (Virtual Lanes) logically partition a single physical medium.&lt;/li&gt;
&lt;li&gt;Each VL is given a priority through the SL (Service Level), which provides QoS.&lt;/li&gt;
&lt;li&gt;CRC (Cyclic Redundancy Check)
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A checksum for data integrity.&lt;/li&gt;
&lt;li&gt;The VCRC (Variant CRC) is 16 bits and is recomputed at every hop.&lt;/li&gt;
&lt;li&gt;The ICRC (Invariant CRC) is 32 bits. It is an end-to-end checksum that stays the same across hops.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
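&lt;p data-ke-size=&quot;size16&quot;&gt;The difference between the two checksums can be illustrated with a small sketch. It uses Python's binascii.crc_hqx and zlib.crc32 purely as stand-ins for a 16-bit and a 32-bit CRC and does not claim to reproduce the exact IB polynomials or field coverage. The one-byte mutable header is also an assumption made for the example.&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import binascii
import zlib

def vcrc16(packet_bytes):
    '''Stand-in 16-bit CRC over the whole packet, recomputed per hop.'''
    return binascii.crc_hqx(packet_bytes, 0)

def icrc32(invariant_bytes):
    '''Stand-in 32-bit CRC over only the fields that never change in flight.'''
    return zlib.crc32(invariant_bytes)

payload = b'rdma-payload'        # invariant part of the packet
hop1 = bytes([0x01]) + payload   # hypothetical mutable header byte at hop 1
hop2 = bytes([0x02]) + payload   # same packet after a switch rewrote that byte

# The per-hop checksum covers the rewritten byte, so it changes.
print(vcrc16(hop1) != vcrc16(hop2))
# The end-to-end checksum skips the mutable byte, so it stays the same.
print(icrc32(hop1[1:]) == icrc32(hop2[1:]))&lt;/code&gt;&lt;/pre&gt;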
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;b&gt;3. Network Layer&lt;/b&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The network layer handles communication between subnets, using GIDs.&lt;/li&gt;
&lt;li&gt;The GRH carries the 128-bit, IPv6-format Src. &amp;amp; Dst. GIDs.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1751184828282&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;$ ibstatus
Infiniband device 'mlx5_0' port 1 status:
	default gid:	 fe80:0000:0000:0000:xxxx:xxxx:xxxx:xxxx	&amp;lt;&amp;lt;&amp;lt; IPv6 address
	base lid:	 0x1
	sm lid:		 0x11
	state:		 4: ACTIVE
	phys state:	 5: LinkUp
	rate:		 200 Gb/sec (4X HDR)
	link_layer:	 InfiniBand
...&lt;/code&gt;&lt;/pre&gt;
&lt;pre id=&quot;code_1751184885080&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;$ ibstat
CA 'mlx5_0'
	CA type: MT4123
	Number of ports: 1
	Firmware version: 20.39.1002
	Hardware version: 0
	Node GUID: 0xXXXXXXXXXXXXXXXX	&amp;lt;&amp;lt;&amp;lt; GUID&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1690&quot; data-origin-height=&quot;1114&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/pdm5I/btsOVIb1Vow/chqwJvsRmMCYEWWH4RG1QK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/pdm5I/btsOVIb1Vow/chqwJvsRmMCYEWWH4RG1QK/img.png&quot; data-alt=&quot;IB GID communication flow&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/pdm5I/btsOVIb1Vow/chqwJvsRmMCYEWWH4RG1QK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fpdm5I%2FbtsOVIb1Vow%2FchqwJvsRmMCYEWWH4RG1QK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1690&quot; height=&quot;1114&quot; data-origin-width=&quot;1690&quot; data-origin-height=&quot;1114&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;IB GID communication flow&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;b&gt;4. Transport Layer&lt;/b&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;This layer provides packet ordering, partitioning, and the transport services (RC/RD/UC/UD).&lt;/li&gt;
&lt;li&gt;Fragmentation and reassembly are performed based on the MTU.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The IB MTU is 256/512/1024/2048/4096 bytes.&amp;nbsp;&lt;/li&gt;
&lt;li&gt;Packets are reassembled by referring to the Dst. QP (Queue Pair) and QPN (Queue Pair Number) in the BTH.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;The Packet Sequence Number (PSN) guarantees packet ordering.&lt;/li&gt;
&lt;li&gt;When the ACK sent by the receiver arrives, the status is updated in the CQ (Completion Queue).&lt;/li&gt;
&lt;li&gt;Every layer up to this point is offloaded to hardware, with no CPU involvement.&lt;/li&gt;
&lt;li&gt;Credit-Based Flow Control
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A 12-bit credit is used to estimate how much buffer is available to receive data losslessly, end to end.&lt;/li&gt;
&lt;li&gt;Credits are updated as the sender and the receiver advertise their available buffers to each other.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
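&lt;p data-ke-size=&quot;size16&quot;&gt;MTU-based fragmentation and PSN-ordered reassembly can be sketched as below. This is a simplified model and not what the HCA firmware does: the functions segment and reassemble are made-up names, the PSN is treated as a 24-bit counter, and wrap-around during reassembly is deliberately ignored.&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;PSN_MODULO = 2 ** 24  # the PSN is treated as a 24-bit counter

def segment(payload, mtu, first_psn=0):
    '''Split payload into (psn, chunk) pairs of at most mtu bytes each.'''
    return [
        ((first_psn + index) % PSN_MODULO, payload[offset:offset + mtu])
        for index, offset in enumerate(range(0, len(payload), mtu))
    ]

def reassemble(segments):
    '''Rebuild the payload in PSN order (assumes no PSN wrap-around).'''
    return b''.join(chunk for _, chunk in sorted(segments))

message = bytes(10000)                 # 10000 zero bytes as a dummy payload
packets = segment(message, mtu=4096)
print(len(packets))
# Deliver the packets in reverse order and check the payload survives.
print(reassemble(reversed(packets)) == message)&lt;/code&gt;&lt;/pre&gt;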
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;&amp;nbsp;* What is a Queue Pair?&lt;/b&gt;&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Much like a connection-oriented TCP socket, IB creates a QP, a pairing of an SQ (Send Queue) and an RQ (Receive Queue), once the receiver's buffer (credit) has been advertised. Unlike an ordinary ring buffer, it is a FIFO queue, and it is identified by a 24-bit QPN assigned by the HCA.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;QP0 (0x000000) is used to exchange SMPs&lt;/li&gt;
&lt;li&gt;QP1 (0x000001) is used to exchange MADs&lt;/li&gt;
&lt;li&gt;0xFFFFFF is multicast&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1688&quot; data-origin-height=&quot;662&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bsV9O3/btsOWea3K6l/y0Zq9hgiZTrIYIlbtdFgj0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bsV9O3/btsOWea3K6l/y0Zq9hgiZTrIYIlbtdFgj0/img.png&quot; data-alt=&quot;Queue Pair structure&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bsV9O3/btsOWea3K6l/y0Zq9hgiZTrIYIlbtdFgj0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbsV9O3%2FbtsOWea3K6l%2Fy0Zq9hgiZTrIYIlbtdFgj0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1688&quot; height=&quot;662&quot; data-origin-width=&quot;1688&quot; data-origin-height=&quot;662&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Queue Pair structure&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
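&lt;li&gt;&lt;b&gt;QP creation and the RDMA flow&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Once a QP is created, the HCA registers an MR so that the peer's memory region is known, and stores the R_Key (Remote Key) alongside it in the structure.&lt;/li&gt;
&lt;li&gt;An application that wants to do a remote read/write builds a Work Request (WR) and posts it to the HCA.&lt;/li&gt;
&lt;li&gt;The HCA converts the WR into a Work Queue Element (WQE), places it on the SQ/RQ, and processes it asynchronously.&lt;/li&gt;
&lt;li&gt;A completed WR is converted into a CQE and queued on the Completion Queue (CQ), which sits outside the QP. It is then delivered to the application as a Work Completion (WC) through ibv_poll_cq().&lt;br /&gt;(Reportedly there is almost no conceptual difference between a CQE and a WC.)&lt;/li&gt;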
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
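&lt;p data-ke-size=&quot;size16&quot;&gt;The WR &amp;gt; WQE &amp;gt; CQE &amp;gt; WC flow above can be modelled with two plain FIFO queues. This is only a toy model for building intuition: ToyQueuePair, post_send, process_one, and poll_cq are made-up names that merely echo the verbs terminology, and nothing here touches a real HCA or the verbs library.&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;from collections import deque

class ToyQueuePair:
    '''Toy model of a QP send path; not the verbs API.'''

    def __init__(self, completion_queue):
        self.send_queue = deque()   # FIFO of WQEs
        self.cq = completion_queue  # may be shared with other QPs

    def post_send(self, work_request):
        '''The application posts a WR; it is queued as a WQE.'''
        self.send_queue.append({'wqe': work_request})

    def process_one(self):
        '''Stand-in for the HCA: finish the oldest WQE and emit a CQE.'''
        wqe = self.send_queue.popleft()
        self.cq.append({'wr_id': wqe['wqe']['wr_id'], 'status': 'success'})

def poll_cq(cq):
    '''Drain the CQ and hand the completions (WCs) to the application.'''
    completions = list(cq)
    cq.clear()
    return completions

cq = deque()
qp = ToyQueuePair(cq)
qp.post_send({'wr_id': 1, 'opcode': 'RDMA_WRITE'})
qp.post_send({'wr_id': 2, 'opcode': 'RDMA_READ'})
qp.process_one()
qp.process_one()
print(poll_cq(cq))&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Passing the same deque to several ToyQueuePair instances models a CQ shared between QPs.&lt;/p&gt;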
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;750&quot; data-origin-height=&quot;1012&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/byUaPc/btsO5BqHqtG/DAHatV5SkUYWZmRW1iDPyk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/byUaPc/btsO5BqHqtG/DAHatV5SkUYWZmRW1iDPyk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/byUaPc/btsO5BqHqtG/DAHatV5SkUYWZmRW1iDPyk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbyUaPc%2FbtsO5BqHqtG%2FDAHatV5SkUYWZmRW1iDPyk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;480&quot; height=&quot;648&quot; data-origin-width=&quot;750&quot; data-origin-height=&quot;1012&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;QPs and CQs are independent of each other: each QP can use its own CQ, or several QPs can share one.&lt;/li&gt;
&lt;/ul&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%; height: 17px;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot;&gt;
&lt;tbody&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 50%; height: 17px;&quot;&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;826&quot; data-origin-height=&quot;460&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/eDSmSg/btsQDd9TRTG/RURwPehkAaimjVPNkIRAy1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/eDSmSg/btsQDd9TRTG/RURwPehkAaimjVPNkIRAy1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/eDSmSg/btsQDd9TRTG/RURwPehkAaimjVPNkIRAy1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FeDSmSg%2FbtsQDd9TRTG%2FRURwPehkAaimjVPNkIRAy1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;826&quot; height=&quot;460&quot; data-origin-width=&quot;826&quot; data-origin-height=&quot;460&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/td&gt;
&lt;td style=&quot;width: 50%; height: 17px;&quot;&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;786&quot; data-origin-height=&quot;442&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/xpy5X/btsQEFxyFQx/g7kJeyR12tKK2TknXTmEXk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/xpy5X/btsQEFxyFQx/g7kJeyR12tKK2TknXTmEXk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/xpy5X/btsQEFxyFQx/g7kJeyR12tKK2TknXTmEXk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fxpy5X%2FbtsQEFxyFQx%2Fg7kJeyR12tKK2TknXTmEXk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;786&quot; height=&quot;442&quot; data-origin-width=&quot;786&quot; data-origin-height=&quot;442&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1486&quot; data-origin-height=&quot;1274&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/6F2W4/btsQS3MIoMu/NuD2UPtot3zL93uHIzG2NK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/6F2W4/btsQS3MIoMu/NuD2UPtot3zL93uHIzG2NK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/6F2W4/btsQS3MIoMu/NuD2UPtot3zL93uHIzG2NK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F6F2W4%2FbtsQS3MIoMu%2FNuD2UPtot3zL93uHIzG2NK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1486&quot; height=&quot;1274&quot; data-origin-width=&quot;1486&quot; data-origin-height=&quot;1274&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1668&quot; data-origin-height=&quot;744&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b4lD1Q/btsQTjBBYTl/tLKxxP7KQUjlTLmRg9Rbu1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b4lD1Q/btsQTjBBYTl/tLKxxP7KQUjlTLmRg9Rbu1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b4lD1Q/btsQTjBBYTl/tLKxxP7KQUjlTLmRg9Rbu1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb4lD1Q%2FbtsQTjBBYTl%2FtLKxxP7KQUjlTLmRg9Rbu1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1668&quot; height=&quot;744&quot; data-origin-width=&quot;1668&quot; data-origin-height=&quot;744&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Work Request and Memory Region
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A WR describes its buffers as sg_list (scatter/gather) entries, so data can be sent or received in a single operation without copying, even when the MR regions are not contiguous.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
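&lt;p data-ke-size=&quot;size16&quot;&gt;As a rough illustration of the scatter/gather idea above, the Python sketch below models a registered region as a byte string and an sg_list as (offset, length) entries. The names Sge and gather are hypothetical and not part of any RDMA library. The sketch joins the segments in software, whereas a real HCA would read them directly by DMA without an intermediate copy.&lt;/p&gt;

```python
# Illustrative model only; Sge and gather are made-up names for this sketch.
from dataclasses import dataclass


@dataclass
class Sge:
    offset: int  # start of the segment inside the registered region
    length: int  # number of bytes in the segment


def gather(region: bytes, sg_list: list) -> bytes:
    """Concatenate the segments in sg_list order, as one message."""
    return b"".join(region[s.offset:s.offset + s.length] for s in sg_list)


# Two non-contiguous segments of the same region form a single message.
region = b"HEADER....PAYLOAD"
message = gather(region, [Sge(0, 6), Sge(10, 7)])
print(message)  # b'HEADERPAYLOAD'
```

&lt;p data-ke-size=&quot;size16&quot;&gt;The receive side works the same way in reverse: one incoming message is scattered across the listed segments.&lt;/p&gt;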
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1012&quot; data-origin-height=&quot;690&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/D2oPE/btsQFOgc6Ls/lSku819qIJbjs63Lip8Tik/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/D2oPE/btsQFOgc6Ls/lSku819qIJbjs63Lip8Tik/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/D2oPE/btsQFOgc6Ls/lSku819qIJbjs63Lip8Tik/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FD2oPE%2FbtsQFOgc6Ls%2FlSku819qIJbjs63Lip8Tik%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;522&quot; height=&quot;356&quot; data-origin-width=&quot;1012&quot; data-origin-height=&quot;690&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 style=&quot;color: #000000;&quot; data-ke-size=&quot;size20&quot;&gt;* Completion Wait Model&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;There are two ways to check whether a send or receive has completed: polling and interrupts.&lt;/li&gt;
&lt;li&gt;Polling
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The application checks the CQ itself.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Completion Channel Interrupt
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The HCA triggers a completion event and interrupts the application through the Completion Channel.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
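&lt;p data-ke-size=&quot;size16&quot;&gt;The difference between the two models can be sketched with plain Python objects. This is only a simulation under simple assumptions: a deque stands in for the CQ, a threading.Event stands in for the Completion Channel, and hca_complete, poll_once, and wait_event are hypothetical helper names, not verbs API calls.&lt;/p&gt;

```python
import threading
from collections import deque

cq = deque()                 # stand-in for a Completion Queue
channel = threading.Event()  # stand-in for a Completion Channel


def hca_complete(wr_id):
    """Pretend the HCA finished the work request with this id."""
    cq.append(wr_id)
    channel.set()  # notification that wakes an event-driven waiter


def poll_once():
    """Polling model: return one completion if present, else None."""
    return cq.popleft() if cq else None


def wait_event(timeout=5.0):
    """Interrupt model: sleep until notified, then drain the CQ."""
    if not channel.wait(timeout):
        return []
    channel.clear()
    done = []
    while cq:
        done.append(cq.popleft())
    return done


empty = poll_once()  # nothing has completed yet, so this is None
threading.Timer(0.05, hca_complete, args=(7,)).start()
completed = wait_event()  # blocks until the simulated HCA signals
print(empty, completed)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Polling keeps latency low at the cost of a busy CPU, while waiting on the channel frees the CPU but adds wake-up latency.&lt;/p&gt;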
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;QP Service Types (RC/UD)&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A QP supports four service types (communication modes): &lt;b&gt;RC&lt;/b&gt;, RD, &lt;b&gt;UD&lt;/b&gt;, and UC.&lt;/li&gt;
&lt;/ul&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%; height: 159px;&quot; border=&quot;1&quot; data-ke-style=&quot;style12&quot; data-ke-align=&quot;alignLeft&quot;&gt;
&lt;tbody&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 20%; height: 17px;&quot;&gt;Service Type&lt;/td&gt;
&lt;td style=&quot;width: 20%; height: 17px;&quot;&gt;&lt;b&gt;Reliable Connection&lt;/b&gt;&lt;/td&gt;
&lt;td style=&quot;width: 20%; height: 17px;&quot;&gt;&lt;s&gt;Reliable Datagram&lt;/s&gt;&lt;/td&gt;
&lt;td style=&quot;width: 20%; height: 17px;&quot;&gt;&lt;b&gt;Unreliable Datagram&lt;/b&gt;&lt;/td&gt;
&lt;td style=&quot;width: 20%; height: 17px;&quot;&gt;Unreliable Connection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 19px;&quot;&gt;
&lt;td style=&quot;width: 20%; height: 19px;&quot;&gt;Error detection&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 19px;&quot;&gt;O&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 19px;&quot;&gt;O&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 19px;&quot;&gt;X&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 19px;&quot;&gt;O&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 20%; height: 17px;&quot;&gt;Error recovery&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;Packet retransmission&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;Packet retransmission&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;X&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;X&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 20%; height: 17px;&quot;&gt;Multicast&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;X&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;X&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;O&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;X&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 20%; height: 17px;&quot;&gt;Shared receive queue&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;O&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;X&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;O&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;X&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 20%; height: 17px;&quot;&gt;Max message size&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;-&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;-&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;Up to 4,096 bytes (one path MTU)&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 20%; height: 17px;&quot;&gt;RDMA Support&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;O&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;O&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;X&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 17px;&quot;&gt;Write only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 19px;&quot;&gt;
&lt;td style=&quot;width: 20%; height: 19px;&quot;&gt;Connection type&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 19px;&quot;&gt;1:1&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 19px;&quot;&gt;-&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 19px;&quot;&gt;1:N&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center; height: 19px;&quot;&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 20%;&quot;&gt;&lt;span style=&quot;background-color: #efefef; color: #333333; text-align: start;&quot;&gt;TCP/IP analogy&lt;/span&gt;&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center;&quot;&gt;TCP&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center;&quot;&gt;-&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center;&quot;&gt;UDP&lt;/td&gt;
&lt;td style=&quot;width: 20%; text-align: center;&quot;&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;RC
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
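&lt;li&gt;After the sender's HCA transmits the WQE, the completion is queued to the CQ only once the receiver's ACK has arrived.&lt;/li&gt;
&lt;li&gt;Communication is 1:1, so a separate QP must be created for every process that needs to communicate.&lt;/li&gt;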
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;UD
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
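&lt;li&gt;The sender's HCA transmits the WQE and queues the completion to the sender's CQ right away, without waiting for an ACK.&lt;/li&gt;
&lt;li&gt;1:N communication is possible through multicast.&lt;/li&gt;
&lt;li&gt;It is used by IPoIB (IP over IB), which provides compatibility with OSI layer 3.&lt;/li&gt;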
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
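&lt;p data-ke-size=&quot;size16&quot;&gt;The completion timing described for RC and UD can be written down as a tiny model. It is a simplified sketch, not real transport logic: sender_events is a hypothetical function, and the event labels are chosen only for illustration.&lt;/p&gt;

```python
def sender_events(service_type, ack_arrives=True):
    """List what the sending side observes for one SEND, in order."""
    events = ["transmit"]
    if service_type == "UD":
        events.append("cqe")             # queued right after transmission
    elif service_type == "RC":
        if ack_arrives:
            events += ["ack", "cqe"]     # queued only after the ACK
        else:
            events.append("retransmit")  # RC recovers by resending
    else:
        raise ValueError("only RC and UD are modeled here")
    return events


rc_ok = sender_events("RC")
rc_lost = sender_events("RC", ack_arrives=False)
ud = sender_events("UD")
print(rc_ok, rc_lost, ud)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;A UD completion therefore only says that the packet left the sender, while an RC completion says that the peer acknowledged it.&lt;/p&gt;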
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2660&quot; data-origin-height=&quot;1448&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/desZnV/btsO6DnyweJ/t77gnlzwKvAhazKZQt35w1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/desZnV/btsO6DnyweJ/t77gnlzwKvAhazKZQt35w1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/desZnV/btsO6DnyweJ/t77gnlzwKvAhazKZQt35w1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdesZnV%2FbtsO6DnyweJ%2Ft77gnlzwKvAhazKZQt35w1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;2660&quot; height=&quot;1448&quot; data-origin-width=&quot;2660&quot; data-origin-height=&quot;1448&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;3184&quot; data-origin-height=&quot;1618&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bN7aKP/btsO5NEv6TS/G9K78WK9tqfN2sho8ZDZgK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bN7aKP/btsO5NEv6TS/G9K78WK9tqfN2sho8ZDZgK/img.png&quot; data-alt=&quot;RC / UD QP Connection&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bN7aKP/btsO5NEv6TS/G9K78WK9tqfN2sho8ZDZgK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbN7aKP%2FbtsO5NEv6TS%2FG9K78WK9tqfN2sho8ZDZgK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;3184&quot; height=&quot;1618&quot; data-origin-width=&quot;3184&quot; data-origin-height=&quot;1618&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;RC / UD QP Connection&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;IB operation&lt;/li&gt;
&lt;/ul&gt;
&lt;table style=&quot;border-collapse: collapse; width: 97.093%; height: 102px;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot; data-ke-style=&quot;style12&quot;&gt;
&lt;tbody&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 33.3333%; height: 17px; text-align: center;&quot;&gt;Operation&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%; height: 17px; text-align: center;&quot;&gt;UD&lt;/td&gt;
&lt;td style=&quot;width: 30.4263%; height: 17px; text-align: center;&quot;&gt;RC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 33.3333%; height: 17px; text-align: center;&quot;&gt;SEND&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%; height: 17px; text-align: center;&quot;&gt;O&lt;/td&gt;
&lt;td style=&quot;width: 30.4263%; height: 17px; text-align: center;&quot;&gt;O&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 33.3333%; height: 17px; text-align: center;&quot;&gt;RDMA WRITE&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%; height: 17px; text-align: center;&quot;&gt;X&lt;/td&gt;
&lt;td style=&quot;width: 30.4263%; height: 17px; text-align: center;&quot;&gt;O&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 33.3333%; height: 17px; text-align: center;&quot;&gt;RDMA WRITE w/ Imm&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%; height: 17px; text-align: center;&quot;&gt;X&lt;/td&gt;
&lt;td style=&quot;width: 30.4263%; height: 17px; text-align: center;&quot;&gt;O&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 33.3333%; height: 17px; text-align: center;&quot;&gt;RDMA READ&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%; height: 17px; text-align: center;&quot;&gt;X&lt;/td&gt;
&lt;td style=&quot;width: 30.4263%; height: 17px; text-align: center;&quot;&gt;O&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 33.3333%; height: 17px; text-align: center;&quot;&gt;Atomic Operation&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%; height: 17px; text-align: center;&quot;&gt;X&lt;/td&gt;
&lt;td style=&quot;width: 30.4263%; height: 17px; text-align: center;&quot;&gt;O&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%; height: 529px;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot;&gt;
&lt;tbody&gt;
&lt;tr style=&quot;height: 241px;&quot;&gt;
&lt;td style=&quot;width: 50%; height: 241px;&quot;&gt;&lt;b&gt;1. SEND&lt;/b&gt;&lt;br /&gt;&amp;nbsp;- The sender specifies the MR that holds the data to send.&lt;br /&gt;&amp;nbsp;- The receiver specifies the MR to receive into; when the data arrives successfully, a completion is queued to its CQ.&lt;br /&gt;&lt;br /&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;906&quot; data-origin-height=&quot;236&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/caK9qY/btsQFtcmHHG/u3akQgwxr1LknF3gB7Zqi0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/caK9qY/btsQFtcmHHG/u3akQgwxr1LknF3gB7Zqi0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/caK9qY/btsQFtcmHHG/u3akQgwxr1LknF3gB7Zqi0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcaK9qY%2FbtsQFtcmHHG%2Fu3akQgwxr1LknF3gB7Zqi0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;664&quot; height=&quot;173&quot; data-origin-width=&quot;906&quot; data-origin-height=&quot;236&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;br /&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 237px;&quot;&gt;
&lt;td style=&quot;width: 50%; height: 237px;&quot;&gt;&lt;b&gt;2. RDMA WRITE&lt;br /&gt;&lt;/b&gt;&amp;nbsp;- The sender specifies the MR on the receiving side that the data will be written into.&lt;br /&gt;&amp;nbsp;- It then sets the receiver's memory address and length in the Send WR.&lt;br /&gt;&amp;nbsp;- The data is written directly into the receiver's memory without going through its RQ.&lt;br /&gt;&amp;nbsp;- The receiving application sends no acknowledgment and gets no completion; a failure triggers an asynchronous error.&lt;br /&gt;&lt;br /&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;912&quot; data-origin-height=&quot;240&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/ytKY9/btsQEEMd8ny/CgTn9NBmpSvSomdKcjU4kk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/ytKY9/btsQEEMd8ny/CgTn9NBmpSvSomdKcjU4kk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/ytKY9/btsQEEMd8ny/CgTn9NBmpSvSomdKcjU4kk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FytKY9%2FbtsQEEMd8ny%2FCgTn9NBmpSvSomdKcjU4kk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;912&quot; height=&quot;240&quot; data-origin-width=&quot;912&quot; data-origin-height=&quot;240&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 50%; height: 17px;&quot;&gt;&lt;b&gt;3. RDMA WRITE with Immediate&lt;br /&gt;&lt;/b&gt;&amp;nbsp;- With a plain RDMA Write, the receiver cannot tell that data has been written into its memory.&lt;br /&gt;&amp;nbsp;- For that reason a 32-bit value is sent along in the WR's imm_data; the receiver gets a completion on its CQ carrying this value, which tells it that its memory has been written.&lt;br /&gt;&lt;br /&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;898&quot; data-origin-height=&quot;228&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/btkXZI/btsQFtDtg0T/Wuo1vrt5oUIuQfvKk7DYaK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/btkXZI/btsQFtDtg0T/Wuo1vrt5oUIuQfvKk7DYaK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/btkXZI/btsQFtDtg0T/Wuo1vrt5oUIuQfvKk7DYaK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbtkXZI%2FbtsQFtDtg0T%2FWuo1vrt5oUIuQfvKk7DYaK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;898&quot; height=&quot;228&quot; data-origin-width=&quot;898&quot; data-origin-height=&quot;228&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 50%; height: 17px;&quot;&gt;&lt;b&gt;4. RDMA READ&lt;br /&gt;&lt;/b&gt;&amp;nbsp;- Follows the same steps as RDMA WRITE, except that the data flows from the remote side's memory into the requester's memory.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;914&quot; data-origin-height=&quot;228&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bdhk2R/btsQCwBLQSY/4wqmz1a0UdLh1rrY5h7KfK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bdhk2R/btsQCwBLQSY/4wqmz1a0UdLh1rrY5h7KfK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bdhk2R/btsQCwBLQSY/4wqmz1a0UdLh1rrY5h7KfK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbdhk2R%2FbtsQCwBLQSY%2F4wqmz1a0UdLh1rrY5h7KfK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;914&quot; height=&quot;228&quot; data-origin-width=&quot;914&quot; data-origin-height=&quot;228&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 17px;&quot;&gt;
&lt;td style=&quot;width: 50%; height: 17px;&quot;&gt;&lt;b&gt;5. ATOMIC Operations&lt;br /&gt;&lt;/b&gt;&amp;nbsp;- The instructions in this category are &lt;b&gt;Fetch and Add (FAA)&lt;/b&gt; and &lt;b&gt;Compare and Swap (CAS)&lt;/b&gt;.&lt;br /&gt;&amp;nbsp;- FAA&lt;br /&gt;&amp;nbsp; &amp;nbsp; The sender sends a memory address and an increment. The receiver adds the increment to the value at that address and stores the result.&lt;br /&gt;&amp;nbsp;- CAS&lt;br /&gt;&amp;nbsp; &amp;nbsp; The sender sends a memory address, the expected current value, and the new value. If the receiver's current value equals the expected value, it is replaced with the new value.&lt;br /&gt;&lt;br /&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;912&quot; data-origin-height=&quot;228&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b7GX09/btsQSyEZhMJ/zqDXSFhnn6SKB6660PxYc0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b7GX09/btsQSyEZhMJ/zqDXSFhnn6SKB6660PxYc0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b7GX09/btsQSyEZhMJ/zqDXSFhnn6SKB6660PxYc0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb7GX09%2FbtsQSyEZhMJ%2FzqDXSFhnn6SKB6660PxYc0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;912&quot; height=&quot;228&quot; data-origin-width=&quot;912&quot; data-origin-height=&quot;228&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;b&gt;&lt;b&gt;&lt;b&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/b&gt;&lt;/b&gt;&lt;/b&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
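&lt;p data-ke-size=&quot;size16&quot;&gt;The two atomic operations can be sketched in Python by treating the remote memory as a dictionary keyed by address. The function names and the dictionary are assumptions made for this illustration; on real hardware the HCA performs the read-modify-write on a 64-bit word and returns the original value to the requester.&lt;/p&gt;

```python
MOD64 = 2**64  # atomics operate on 64-bit values, so additions wrap around


def fetch_and_add(memory, addr, add):
    """Add to the value at addr and return the value seen before the add."""
    original = memory[addr]
    memory[addr] = (original + add) % MOD64
    return original


def compare_and_swap(memory, addr, compare, swap):
    """Store swap only if the current value equals compare; return the old value."""
    original = memory[addr]
    if original == compare:
        memory[addr] = swap
    return original


remote = {0x1000: 5}
old_faa = fetch_and_add(remote, 0x1000, 3)         # remote value becomes 8
old_cas = compare_and_swap(remote, 0x1000, 8, 42)  # matches, so it is swapped
missed = compare_and_swap(remote, 0x1000, 8, 0)    # no match, left unchanged
print(old_faa, old_cas, missed, remote[0x1000])    # 5 8 42 42
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Because CAS returns the old value, the requester can tell whether the swap happened by comparing that value with what it expected.&lt;/p&gt;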
&lt;p style=&quot;position: absolute;&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;* IB Multicast&lt;/b&gt;&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
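&lt;li&gt;In IB, multicast is available only with UD.&lt;/li&gt;
&lt;li&gt;It uses the LID range reserved for multicast (0xC000 to 0xFFFE).&lt;/li&gt;
&lt;li&gt;Nodes that take part in a multicast group register themselves with the SM (Subnet Manager), and the SM distributes the multicast table to each switch.&lt;/li&gt;
&lt;li&gt;Because the sender cannot know the receivers' QPNs, it sends to QPN 0xFFFFFF, and each node registers the QPNs that want to receive the multicast.&lt;/li&gt;
&lt;li&gt;An HCA that receives a multicast packet replicates it and delivers a copy to every registered QP.&lt;/li&gt;
&lt;li&gt;A multicast GID is used for communication between subnets.&lt;/li&gt;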
&lt;/ul&gt;
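&lt;p data-ke-size=&quot;size16&quot;&gt;The receive-side behavior in the list above can be modeled in a few lines. This is a simplified sketch: the Hca class and its methods are hypothetical, and groups are keyed by multicast LID only, whereas a real attachment also involves the multicast GID.&lt;/p&gt;

```python
MCAST_LID_FIRST = 0xC000
MCAST_LID_LAST = 0xFFFE
MCAST_QPN = 0xFFFFFF  # destination QPN carried by multicast packets


def is_multicast_lid(lid):
    """True when the LID falls in the range reserved for multicast."""
    return lid in range(MCAST_LID_FIRST, MCAST_LID_LAST + 1)


class Hca:
    def __init__(self):
        self.attached = {}  # multicast LID mapped to a list of local QPNs

    def attach(self, mlid, qpn):
        """Register a local QP as a receiver for the group."""
        self.attached.setdefault(mlid, []).append(qpn)

    def receive(self, dlid, dest_qpn, payload):
        """Return one (qpn, payload) delivery per attached QP."""
        if not is_multicast_lid(dlid) or dest_qpn != MCAST_QPN:
            return []
        return [(qpn, payload) for qpn in self.attached.get(dlid, [])]


hca = Hca()
hca.attach(0xC001, 17)
hca.attach(0xC001, 18)
deliveries = hca.receive(0xC001, MCAST_QPN, b"hello")
print(deliveries)  # [(17, b'hello'), (18, b'hello')]
```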
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;* Protection Domain&lt;/b&gt;&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
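&lt;li&gt;Memory regions are protected by the Local Key (L_Key) and Remote Key (R_Key) associated with them.&lt;/li&gt;
&lt;li&gt;The L_Key is required for the HCA to access local memory and is used to translate virtual addresses to physical memory.&lt;/li&gt;
&lt;li&gt;The R_Key is required to access memory on the remote side.&lt;/li&gt;
&lt;li&gt;For an RDMA Read/Write, the R_Key is included in the Send WR and carried in the packet; if the key does not match, the receiver discards the packet and returns a NAK.&lt;/li&gt;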
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;836&quot; data-origin-height=&quot;592&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/c72gza/btsQTTPbBDA/hQC0qhsI2jq4vJYNUvsGv1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/c72gza/btsQTTPbBDA/hQC0qhsI2jq4vJYNUvsGv1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/c72gza/btsQTTPbBDA/hQC0qhsI2jq4vJYNUvsGv1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fc72gza%2FbtsQTTPbBDA%2FhQC0qhsI2jq4vJYNUvsGv1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;650&quot; height=&quot;460&quot; data-origin-width=&quot;836&quot; data-origin-height=&quot;592&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
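&lt;p data-ke-size=&quot;size16&quot;&gt;A minimal sketch of the R_Key check on the responding side is shown below. MemoryRegion and check_rdma_write are hypothetical names for this illustration. Besides the key comparison described above, the sketch also checks the address range and the granted access right, which is an added assumption about how such a validation would typically be completed.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class MemoryRegion:
    addr: int           # start address of the registered region
    length: int         # size of the region in bytes
    rkey: int           # key a remote peer must present
    remote_write: bool  # whether remote write access was granted


def check_rdma_write(mr, rkey, addr, length):
    """Return 'ACK' if the write may proceed, otherwise 'NAK'."""
    key_ok = rkey == mr.rkey
    in_bounds = addr >= mr.addr and mr.addr + mr.length >= addr + length
    if key_ok and in_bounds and mr.remote_write:
        return "ACK"
    return "NAK"  # the packet is discarded and the requester is told


mr = MemoryRegion(addr=0x2000, length=4096, rkey=0xABCD, remote_write=True)
good = check_rdma_write(mr, 0xABCD, 0x2000, 128)
bad_key = check_rdma_write(mr, 0x1234, 0x2000, 128)
too_long = check_rdma_write(mr, 0xABCD, 0x2F00, 512)
print(good, bad_key, too_long)  # ACK NAK NAK
```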
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;* Partition&lt;/b&gt;&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
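&lt;li&gt;A 16-bit P_Key lets an IB fabric be partitioned in much the same way as L2 VLANs.&lt;/li&gt;
&lt;li&gt;However, the P_Key can be changed from an HCA through SMPs, so the security level is lower than with VLANs.&lt;/li&gt;
&lt;li&gt;It is used to implement VLANs for IPoIB.&lt;/li&gt;
&lt;li&gt;An HCA can hold a P_Key table of up to 128 entries, and a P_Key is applied to each QP individually.&lt;/li&gt;
&lt;li&gt;A QP with no partition assigned uses the default P_Key (0xFFFF).&lt;/li&gt;
&lt;li&gt;The receiver discards packets sent by a QP that does not belong to the same partition.&lt;/li&gt;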
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;894&quot; data-origin-height=&quot;528&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/8TOJf/btsQTA29LdX/ZxBpfpCIWz7WaaUMzK7R60/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/8TOJf/btsQTA29LdX/ZxBpfpCIWz7WaaUMzK7R60/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/8TOJf/btsQTA29LdX/ZxBpfpCIWz7WaaUMzK7R60/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F8TOJf%2FbtsQTA29LdX%2FZxBpfpCIWz7WaaUMzK7R60%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;393&quot; height=&quot;232&quot; data-origin-width=&quot;894&quot; data-origin-height=&quot;528&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Partitions can be &lt;a href=&quot;https://github.com/linux-rdma/opensm/blob/master/doc/partition-config.txt&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;configured on the SM&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Even within the same partition, each GID can be given &lt;b&gt;Full membership&lt;/b&gt; (default) or &lt;b&gt;Limited membership&lt;/b&gt;, which determines whether two nodes can communicate.&lt;br /&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Full &amp;lt;-&amp;gt; Full : can communicate&lt;/li&gt;
&lt;li&gt;Limited &amp;lt;-&amp;gt; Full : can communicate&lt;/li&gt;
&lt;li&gt;Limited &amp;lt;-&amp;gt; Limited : cannot communicate&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
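&lt;p data-ke-size=&quot;size16&quot;&gt;The membership rules above follow from how the 16-bit P_Key is laid out: the top bit marks full membership and the low 15 bits identify the partition. The sketch below encodes that rule; split_pkey and can_communicate are hypothetical helper names used only for illustration.&lt;/p&gt;

```python
def split_pkey(pkey):
    """Return (is_full_member, partition_number) for a 16-bit P_Key."""
    return pkey >= 0x8000, pkey % 0x8000


def can_communicate(pkey_a, pkey_b):
    """Same partition, and at least one side must be a full member."""
    full_a, part_a = split_pkey(pkey_a)
    full_b, part_b = split_pkey(pkey_b)
    return part_a == part_b and (full_a or full_b)


full_full = can_communicate(0xFFFF, 0xFFFF)        # Full and Full
limited_full = can_communicate(0x7FFF, 0xFFFF)     # Limited and Full
limited_limited = can_communicate(0x7FFF, 0x7FFF)  # Limited and Limited
other_partition = can_communicate(0x8001, 0x8002)  # different partitions
print(full_full, limited_full, limited_limited, other_partition)
# True True False False
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Under this layout the default P_Key 0xFFFF is a full member of partition 0x7FFF.&lt;/p&gt;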
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;b&gt;Reference&lt;/b&gt;&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.nminoru.jp/~nminoru/network/infiniband/iba-concept.html&quot;&gt;https://www.nminoru.jp/~nminoru/network/infiniband/iba-concept.html&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1751094648592&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;InfiniBand プログラムに必要な基本的な概念&quot; data-og-description=&quot;このページでは InfiniBand Verbs プログラムをはじめるのに必要な基本的な概念を説明する。 この文書の情報は RDMA_CM API を使ったプログラムをする場合にも有効だと思われる。 以下は関連ペー&quot; data-og-host=&quot;www.nminoru.jp&quot; data-og-source-url=&quot;https://www.nminoru.jp/~nminoru/network/infiniband/iba-concept.html&quot; data-og-url=&quot;https://www.nminoru.jp/~nminoru/network/infiniband/iba-concept.html&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/bqeVnX/hyZceLWuHF/bfqfkEius6pbWdJZpJnFeK/img.png?width=616&amp;amp;height=453&amp;amp;face=0_0_616_453,https://scrap.kakaocdn.net/dn/bCGA1L/hyZcj7zIgR/SkJ9qwkckdQmiQGKg7o9i0/img.png?width=603&amp;amp;height=453&amp;amp;face=0_0_603_453,https://scrap.kakaocdn.net/dn/VqqAT/hyZcfjMDUh/TfnL0UuCO9l41zlAJAMfBk/img.png?width=617&amp;amp;height=392&amp;amp;face=0_0_617_392&quot;&gt;&lt;a href=&quot;https://www.nminoru.jp/~nminoru/network/infiniband/iba-concept.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://www.nminoru.jp/~nminoru/network/infiniband/iba-concept.html&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/bqeVnX/hyZceLWuHF/bfqfkEius6pbWdJZpJnFeK/img.png?width=616&amp;amp;height=453&amp;amp;face=0_0_616_453,https://scrap.kakaocdn.net/dn/bCGA1L/hyZcj7zIgR/SkJ9qwkckdQmiQGKg7o9i0/img.png?width=603&amp;amp;height=453&amp;amp;face=0_0_603_453,https://scrap.kakaocdn.net/dn/VqqAT/hyZcfjMDUh/TfnL0UuCO9l41zlAJAMfBk/img.png?width=617&amp;amp;height=392&amp;amp;face=0_0_617_392');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;InfiniBand プログラムに必要な基本的な概念&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;このページでは InfiniBand Verbs プログラムをはじめるのに必要な基本的な概念を説明する。 この文書の情報は RDMA_CM API を使ったプログラムをする場合にも有効だと思われる。 以下は関連ペー&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;www.nminoru.jp&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://network.nvidia.com/pdf/whitepapers/IB_Intro_WP_190.pdf&quot;&gt;https://network.nvidia.com/pdf/whitepapers/IB_Intro_WP_190.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/other/instinct-mi300-series-cluster-reference-guide.pdf&quot;&gt;https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/other/instinct-mi300-series-cluster-reference-guide.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf&quot;&gt;https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.ufispace.com/company/blog/compare-infiniband-vs-ethernet&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://www.ufispace.com/company/blog/compare-infiniband-vs-ethernet&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1751186681295&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;Infiniband vs Ethernet: Performance, Scalability, and Cost&quot; data-og-description=&quot;&quot; data-og-host=&quot;www.ufispace.com&quot; data-og-source-url=&quot;https://www.ufispace.com/company/blog/compare-infiniband-vs-ethernet&quot; data-og-url=&quot;https://www.ufispace.com/company/blog/compare-infiniband-vs-ethernet&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/cvfOjf/hyZgbmCf85/3MrlFLBp4YeUKcbmMSO631/img.jpg?width=1000&amp;amp;height=667&amp;amp;face=0_0_1000_667&quot;&gt;&lt;a href=&quot;https://www.ufispace.com/company/blog/compare-infiniband-vs-ethernet&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://www.ufispace.com/company/blog/compare-infiniband-vs-ethernet&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/cvfOjf/hyZgbmCf85/3MrlFLBp4YeUKcbmMSO631/img.jpg?width=1000&amp;amp;height=667&amp;amp;face=0_0_1000_667');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Infiniband vs Ethernet: Performance, Scalability, and Cost&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;www.ufispace.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc4755&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://datatracker.ietf.org/doc/html/rfc4755&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1751701507628&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;RFC 4755: IP over InfiniBand: Connected Mode&quot; data-og-description=&quot;This document specifies transmission of IPv4/IPv6 packets and address resolution over the connected modes of InfiniBand. [STANDARDS-TRACK]&quot; data-og-host=&quot;datatracker.ietf.org&quot; data-og-source-url=&quot;https://datatracker.ietf.org/doc/html/rfc4755&quot; data-og-url=&quot;https://datatracker.ietf.org/doc/html/rfc4755&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/9kqrX/hyZfYoyclB/m3SWnNkKRaIEJzM9q4qQsk/img.png?width=1200&amp;amp;height=630&amp;amp;face=0_0_1200_630&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc4755&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://datatracker.ietf.org/doc/html/rfc4755&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/9kqrX/hyZfYoyclB/m3SWnNkKRaIEJzM9q4qQsk/img.png?width=1200&amp;amp;height=630&amp;amp;face=0_0_1200_630');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;RFC 4755: IP over InfiniBand: Connected Mode&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;This document specifies transmission of IPv4/IPv6 packets and address resolution over the connected modes of InfiniBand. [STANDARDS-TRACK]&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;datatracker.ietf.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc4391&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://datatracker.ietf.org/doc/html/rfc4391&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1751701530183&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;RFC 4391: Transmission of IP over InfiniBand (IPoIB)&quot; data-og-description=&quot;This document specifies a method for encapsulating and transmitting IPv4/IPv6 and Address Resolution Protocol (ARP) packets over InfiniBand (IB). It describes the link-layer address to be used when resolving the IP addresses in IP over InfiniBand (IPoIB) s&quot; data-og-host=&quot;datatracker.ietf.org&quot; data-og-source-url=&quot;https://datatracker.ietf.org/doc/html/rfc4391&quot; data-og-url=&quot;https://datatracker.ietf.org/doc/html/rfc4391&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/IidGU/hyZf8SdQyO/Qk8UV82sfpdXQGRoYFcNL0/img.png?width=1200&amp;amp;height=630&amp;amp;face=0_0_1200_630&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc4391&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://datatracker.ietf.org/doc/html/rfc4391&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/IidGU/hyZf8SdQyO/Qk8UV82sfpdXQGRoYFcNL0/img.png?width=1200&amp;amp;height=630&amp;amp;face=0_0_1200_630');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;RFC 4391: Transmission of IP over InfiniBand (IPoIB)&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;This document specifies a method for encapsulating and transmitting IPv4/IPv6 and Address Resolution Protocol (ARP) packets over InfiniBand (IB). It describes the link-layer address to be used when resolving the IP addresses in IP over InfiniBand (IPoIB) s&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;datatracker.ietf.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.rfc-editor.org/rfc/rfc4392.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://www.rfc-editor.org/rfc/rfc4392.html&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1751702413080&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;RFC 4392: IP over InfiniBand (IPoIB) Architecture&quot; data-og-description=&quot;&quot; data-og-host=&quot;www.rfc-editor.org&quot; data-og-source-url=&quot;https://www.rfc-editor.org/rfc/rfc4392.html&quot; data-og-url=&quot;https://www.rfc-editor.org/rfc/rfc4392.html&quot; data-og-image=&quot;&quot;&gt;&lt;a href=&quot;https://www.rfc-editor.org/rfc/rfc4392.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://www.rfc-editor.org/rfc/rfc4392.html&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url();&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;RFC 4392: IP over InfiniBand (IPoIB) Architecture&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;www.rfc-editor.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://iris.polito.it/retrieve/e384c430-ea64-d4b2-e053-9f05fe0a1d67/Final_thesis.pdf&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://iris.polito.it/retrieve/e384c430-ea64-d4b2-e053-9f05fe0a1d67/Final_thesis.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>System Engineering/Network</category>
      <author>Hopulence</author>
      <guid isPermaLink="true">https://hopulence.tistory.com/49</guid>
      <comments>https://hopulence.tistory.com/49#entry49comment</comments>
      <pubDate>Sat, 28 Jun 2025 17:12:22 +0900</pubDate>
    </item>
    <item>
      <title>Kubelet MCE Memory Error - EDAC</title>
      <link>https://hopulence.tistory.com/47</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Several errors appeared on a production system together with the message 'Handling MCE Memory Error'. It turned out not to be a hardware problem, but this post summarizes what I found while investigating.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;* MCE = Machine Check Exception&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1289&quot; data-origin-height=&quot;174&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/kFpom/btsJLd1M0nC/NbCd32fKOPgRUDRduAoEMK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/kFpom/btsJLd1M0nC/NbCd32fKOPgRUDRduAoEMK/img.png&quot; data-alt=&quot;describe node output&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/kFpom/btsJLd1M0nC/NbCd32fKOPgRUDRduAoEMK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FkFpom%2FbtsJLd1M0nC%2FNbCd32fKOPgRUDRduAoEMK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;2343&quot; height=&quot;316&quot; data-origin-width=&quot;1289&quot; data-origin-height=&quot;174&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;describe node output&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;EDAC (Error Detection and Correction)&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;EDAC is a set of kernel modules that detects and reports errors in hardware such as CPU caches, memory, GPUs, and PCI buses. Where the hardware can correct an error (for example with ECC), EDAC records it as corrected.&lt;/li&gt;
&lt;/ul&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The points below summarize the kernel documentation. &lt;br /&gt;/Documentation/driver-api/edac.rst&lt;br /&gt;/Documentation/admin-guide/ras.rst&lt;br /&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;When the CPU writes data through the Memory Controller (MC), the MC computes a check value called the syndrome in real time, using a Hamming code or a stronger code such as SECDED+. The data bits plus the syndrome bits make up the Total width.&lt;/li&gt;
&lt;li&gt;In the dmidecode output below, the Total width of this DIMM is 72 bits and the Data width is 64 bits. The extra 8 bits (72 - 64) are available for error detection and correction; these are the syndrome (ECC) bits.&lt;/li&gt;
&lt;li&gt;If Total width equals Data width, the memory at that Locator has no ECC bits and cannot detect or correct errors.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1730636828785&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;$ dmidecode -t memory
...
Memory Device
	Array Handle: 0x0042
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: DIMM
	Set: None
	Locator: P2-DIMME1
	Bank Locator: P1_Node1_Channel1_Dimm0
	Type: DDR4&lt;/code&gt;&lt;/pre&gt;
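&lt;p data-ke-size=&quot;size16&quot;&gt;The width arithmetic above can be checked per DIMM with a small script. This is only a minimal sketch: the function name ecc_bits_by_locator and the SAMPLE text are made up for illustration, and it assumes dmidecode prints the fields in the same form as the output above.&lt;/p&gt;

```python
import re

# Hypothetical sample, trimmed from the dmidecode output shown above.
SAMPLE = """\
Memory Device
    Total Width: 72 bits
    Data Width: 64 bits
    Locator: P2-DIMME1
    Bank Locator: P1_Node1_Channel1_Dimm0
"""


def ecc_bits_by_locator(dmidecode_text):
    """Return {locator: total_width - data_width} for each Memory Device."""
    result = {}
    # Each DIMM is reported in its own "Memory Device" section.
    for section in dmidecode_text.split("Memory Device")[1:]:
        total = re.search(r"Total Width:\s*(\d+) bits", section)
        data = re.search(r"Data Width:\s*(\d+) bits", section)
        # Anchored to the line start so "Bank Locator" is not picked up.
        locator = re.search(r"^\s*Locator:\s*(.+)$", section, re.MULTILINE)
        if total and data and locator:
            name = locator.group(1).strip()
            result[name] = int(total.group(1)) - int(data.group(1))
    return result


print(ecc_bits_by_locator(SAMPLE))  # {'P2-DIMME1': 8}
```

&lt;p data-ke-size=&quot;size16&quot;&gt;A result of 0 for a Locator would mean that DIMM has no ECC bits. Sections whose widths are not numeric are simply skipped.&lt;/p&gt;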
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/ECC_memory&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;ECC(Error-Correction Code) memory&lt;/a&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;The MC checks the syndrome to detect errors. If ECC can fix the error, it is a CE (Correctable Error); if not, it is a UE (Uncorrectable Error).&lt;/li&gt;
&lt;li&gt;CEs and UEs are recorded in specific MC registers. The BIOS or the EDAC driver reads them, and the counts are exposed under the paths below.&lt;br /&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;ce_count: the number of Correctable Errors counted by the MC (not fatal).&lt;/li&gt;
&lt;li&gt;ue_count: the number of Uncorrectable Errors, that is, errors the system could not auto-correct.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1730619486974&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;$ cat /sys/devices/system/edac/mc/mc*/csrow*/*ce_count
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0&lt;/code&gt;&lt;/pre&gt;
&lt;pre id=&quot;code_1730619517324&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;$ cat /sys/devices/system/edac/mc/mc*/csrow*/*ue_count
0
0
0&lt;/code&gt;&lt;/pre&gt;
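&lt;p data-ke-size=&quot;size16&quot;&gt;The two cat commands above can be folded into one check that sums the per-csrow counters. This is a rough sketch under assumptions: read_edac_counts is a hypothetical helper, and it matches only the files named exactly ce_count and ue_count, so any per-channel files that the shell glob *ce_count also picks up are not added a second time.&lt;/p&gt;

```python
import glob
import os


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Sum ce_count and ue_count over every mc*/csrow* directory under root."""
    totals = {"ce_count": 0, "ue_count": 0}
    for name in totals:
        # Exact file name per csrow, unlike the broader shell glob *ce_count.
        pattern = os.path.join(root, "mc*", "csrow*", name)
        for path in glob.glob(pattern):
            with open(path) as f:
                totals[name] += int(f.read().strip())
    return totals


if __name__ == "__main__":
    # On a host without a loaded EDAC driver this prints zeros.
    print(read_edac_counts())
```

&lt;p data-ke-size=&quot;size16&quot;&gt;A non-zero ue_count total is the value worth alerting on; a slowly growing ce_count is usually just worth watching.&lt;/p&gt;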
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Checking the EDAC kernel module
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;The output below shows the driver for Intel Skylake processors.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1730638678535&quot; style=&quot;background-color: #f8f8f8; color: #383a42;&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;$ lsmod | grep edac
skx_edac			24576	0
nfit				77824	1	skx_edac&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;HardwareCorrupted
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;HardwareCorrupted in /proc/meminfo is the amount of memory in which ECC detected an error and which the kernel therefore treats as corrupted.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1730639058240&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;$ cat /proc/meminfo | grep -i hardware
HardwareCorrupted:		0kB
...&lt;/code&gt;&lt;/pre&gt;
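&lt;p data-ke-size=&quot;size16&quot;&gt;For a scripted check, the same field can be pulled out of the meminfo text. This is a minimal sketch: hardware_corrupted_kb and the SAMPLE string are hypothetical, and the parser only assumes a &quot;name: number kB&quot; layout.&lt;/p&gt;

```python
import re

# Hypothetical sample with the same layout as /proc/meminfo.
SAMPLE = "MemTotal:       32780000 kB\nHardwareCorrupted:     0 kB\n"


def hardware_corrupted_kb(meminfo_text):
    """Return the HardwareCorrupted value in kB, or None if the field is absent."""
    for line in meminfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "HardwareCorrupted":
            # Take the first run of digits so "0 kB" and "0kB" both parse.
            match = re.search(r"\d+", value)
            return int(match.group()) if match else None
    return None


print(hardware_corrupted_kb(SAMPLE))  # 0
```

&lt;p data-ke-size=&quot;size16&quot;&gt;On a real node, pass the contents of /proc/meminfo instead of SAMPLE.&lt;/p&gt;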
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Prometheus node-exporter can also monitor EDAC, but how kubelet came to emit the log above still needs more investigation.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;684&quot; data-origin-height=&quot;66&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bW4u89/btsJJZjzNHN/40GxMksZOi5UoFELq7L0lk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bW4u89/btsJJZjzNHN/40GxMksZOi5UoFELq7L0lk/img.png&quot; data-alt=&quot;node-exporter&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bW4u89/btsJJZjzNHN/40GxMksZOi5UoFELq7L0lk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbW4u89%2FbtsJJZjzNHN%2F40GxMksZOi5UoFELq7L0lk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;684&quot; height=&quot;66&quot; data-origin-width=&quot;684&quot; data-origin-height=&quot;66&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;node-exporter&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Conclusion&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;EDAC does not seem to be a very reliable indicator.&lt;/li&gt;
&lt;li&gt;To tell whether there is a real hardware fault, it is better to check the BIOS information through IPMI, iLO, iDRAC, and similar interfaces.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Reference&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://bluesmoke.sourceforge.net/&quot;&gt;https://bluesmoke.sourceforge.net/&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1730617822966&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;EDAC Project&quot; data-og-description=&quot;&quot; data-og-host=&quot;bluesmoke.sourceforge.net&quot; data-og-source-url=&quot;https://bluesmoke.sourceforge.net/&quot; data-og-url=&quot;https://bluesmoke.sourceforge.net/&quot; data-og-image=&quot;&quot;&gt;&lt;a href=&quot;https://bluesmoke.sourceforge.net/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://bluesmoke.sourceforge.net/&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url();&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;EDAC Project&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;bluesmoke.sourceforge.net&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-edac&quot;&gt;https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-edac&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://github.com/intel/linux-intel-4.9/blob/master/drivers/edac/skx_edac.c&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://github.com/intel/linux-intel-4.9/blob/master/drivers/edac/skx_edac.c&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1730618500890&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;object&quot; data-og-title=&quot;linux-intel-4.9/drivers/edac/skx_edac.c at master &amp;middot; intel/linux-intel-4.9&quot; data-og-description=&quot;Contribute to intel/linux-intel-4.9 development by creating an account on GitHub.&quot; data-og-host=&quot;github.com&quot; data-og-source-url=&quot;https://github.com/intel/linux-intel-4.9/blob/master/drivers/edac/skx_edac.c&quot; data-og-url=&quot;https://github.com/intel/linux-intel-4.9/blob/master/drivers/edac/skx_edac.c&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/dGOFcM/hyXsZPjmCH/lHK3258kE3m8CTp63Rjj2k/img.png?width=1200&amp;amp;height=600&amp;amp;face=0_0_1200_600,https://scrap.kakaocdn.net/dn/OBu1M/hyXsVF8z3y/3nG01J1CjK4TP6qKFRKgb1/img.png?width=1200&amp;amp;height=600&amp;amp;face=0_0_1200_600&quot;&gt;&lt;a href=&quot;https://github.com/intel/linux-intel-4.9/blob/master/drivers/edac/skx_edac.c&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://github.com/intel/linux-intel-4.9/blob/master/drivers/edac/skx_edac.c&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/dGOFcM/hyXsZPjmCH/lHK3258kE3m8CTp63Rjj2k/img.png?width=1200&amp;amp;height=600&amp;amp;face=0_0_1200_600,https://scrap.kakaocdn.net/dn/OBu1M/hyXsVF8z3y/3nG01J1CjK4TP6qKFRKgb1/img.png?width=1200&amp;amp;height=600&amp;amp;face=0_0_1200_600');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;linux-intel-4.9/drivers/edac/skx_edac.c at master &amp;middot; intel/linux-intel-4.9&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;Contribute to intel/linux-intel-4.9 development by creating an account on GitHub.&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;github.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://unix.stackexchange.com/questions/204286/what-does-mean-by-hardwarecorrupted-directmap4k-directmap2m-fields-in-proc-m&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://unix.stackexchange.com/questions/204286/what-does-mean-by-hardwarecorrupted-directmap4k-directmap2m-fields-in-proc-m&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1730638401676&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;what does mean by HardwareCorrupted, DirectMap4k, DirectMap2M fields in &amp;ldquo;/proc/meminfo&amp;rdquo; file of Linux?&quot; data-og-description=&quot;I am looking for description of following terms: HardwareCorrupted, DirectMap4k, DirectMap2M fields in &amp;quot;/proc/meminfo&amp;quot; file of Linux. I could find the following description for the fields&quot; data-og-host=&quot;unix.stackexchange.com&quot; data-og-source-url=&quot;https://unix.stackexchange.com/questions/204286/what-does-mean-by-hardwarecorrupted-directmap4k-directmap2m-fields-in-proc-m&quot; data-og-url=&quot;https://unix.stackexchange.com/questions/204286/what-does-mean-by-hardwarecorrupted-directmap4k-directmap2m-fields-in-proc-m&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/brXSmY/hyXsSW0cH0/EfJGDnXVK5Ms1jDKBJW5Z1/img.png?width=316&amp;amp;height=316&amp;amp;face=0_0_316_316&quot;&gt;&lt;a href=&quot;https://unix.stackexchange.com/questions/204286/what-does-mean-by-hardwarecorrupted-directmap4k-directmap2m-fields-in-proc-m&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://unix.stackexchange.com/questions/204286/what-does-mean-by-hardwarecorrupted-directmap4k-directmap2m-fields-in-proc-m&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/brXSmY/hyXsSW0cH0/EfJGDnXVK5Ms1jDKBJW5Z1/img.png?width=316&amp;amp;height=316&amp;amp;face=0_0_316_316');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;what does mean by HardwareCorrupted, DirectMap4k, DirectMap2M fields in &amp;ldquo;/proc/meminfo&amp;rdquo; file of Linux?&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;I am looking for description of following terms: HardwareCorrupted, DirectMap4k, DirectMap2M fields in &quot;/proc/meminfo&quot; file of Linux. I could find the following description for the fields&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;unix.stackexchange.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>System Engineering/Kubernetes</category>
      <author>Hopulence</author>
      <guid isPermaLink="true">https://hopulence.tistory.com/47</guid>
      <comments>https://hopulence.tistory.com/47#entry47comment</comments>
      <pubDate>Wed, 25 Sep 2024 10:07:50 +0900</pubDate>
    </item>
    <item>
      <title>iSCSI Error Handling</title>
      <link>https://hopulence.tistory.com/45</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;1. SCSI(Small Computer System Interface)&lt;/b&gt;&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;What is SCSI?
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;An I/O standard from the 1980s for peripherals such as HDDs and magnetic tape. It defines a command set that includes Read, Write, and Inquiry.&lt;/li&gt;
&lt;li&gt;SAS (Serial Attached SCSI), a serial interface, uses the SCSI command set.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;675&quot; data-origin-height=&quot;626&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/HscMM/btsJeDVb4j0/hp7Z6ztVgHW3O70HybNrK1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/HscMM/btsJeDVb4j0/hp7Z6ztVgHW3O70HybNrK1/img.png&quot; data-alt=&quot;SCSI I/O system and domain model&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/HscMM/btsJeDVb4j0/hp7Z6ztVgHW3O70HybNrK1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FHscMM%2FbtsJeDVb4j0%2Fhp7Z6ztVgHW3O70HybNrK1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;675&quot; height=&quot;626&quot; data-origin-width=&quot;675&quot; data-origin-height=&quot;626&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;SCSI I/O system and domain model&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
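&lt;li&gt;SCSI follows a client-server model: the Initiator sends commands, and the Target provides logical units, each identified by a LUN (Logical Unit Number). The initiator issues requests and the target returns responses.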
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Logical unit
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
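&lt;li&gt;A logical storage unit in SCSI, identified by its number (the LUN). The initiator sees each one as a single disk, whether it is a whole physical disk or a volume carved out by the target.&lt;/li&gt;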
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Device Service/Task&lt;br /&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Device service requests are I/O commands such as Read, Write, and Inquiry; task requests are control operations such as Abort and Reset.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p style=&quot;color: #333333; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;889&quot; data-origin-height=&quot;479&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/v133c/btsJdHDXJZC/iiDwbd74EAZ1HMJx7y6Xik/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/v133c/btsJdHDXJZC/iiDwbd74EAZ1HMJx7y6Xik/img.png&quot; data-alt=&quot;SAM(SCSI Architecture Model) client-server model&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/v133c/btsJdHDXJZC/iiDwbd74EAZ1HMJx7y6Xik/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fv133c%2FbtsJdHDXJZC%2FiiDwbd74EAZ1HMJx7y6Xik%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;889&quot; height=&quot;479&quot; data-origin-width=&quot;889&quot; data-origin-height=&quot;479&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;SAM(SCSI Architecture Model) client-server model&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;2. iSCSI(Internet SCSI)&lt;/b&gt;&lt;/h4&gt;
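&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;iSCSI is a TCP-based protocol that is compatible with the standard SCSI architecture and was designed for communicating with I/O devices over a network. It keeps SCSI's Initiator - Target model while using existing network infrastructure.&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;After establishing a TCP connection, an iSCSI initiator (e.g. open-iscsi) authenticates and is authorized through an ID/password login. The two sides then exchange Text messages to negotiate parameters such as the maximum receive data length. Once negotiation completes, the initiator receives the LUN information, and the kernel adds the LUN under '/sys/class/scsi_device' and recognizes it as a block device.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;* Because an initiator uses the device exclusively, an iSCSI volume used directly in k8s &lt;u&gt;cannot be used as RWX (ReadWriteMany)&lt;/u&gt;.&lt;/p&gt;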
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1313&quot; data-origin-height=&quot;507&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bdt4rO/btsJgbiEUkb/yTin7NmK5zTPihr0HQkKKK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bdt4rO/btsJgbiEUkb/yTin7NmK5zTPihr0HQkKKK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bdt4rO/btsJgbiEUkb/yTin7NmK5zTPihr0HQkKKK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbdt4rO%2FbtsJgbiEUkb%2FyTin7NmK5zTPihr0HQkKKK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1313&quot; height=&quot;507&quot; data-origin-width=&quot;1313&quot; data-origin-height=&quot;507&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;3. SCSI &amp;amp; iSCSI NOP Error handling&lt;/b&gt;&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;SCSI Error handling
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
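&lt;li&gt;A 30-second timer starts when a SCSI command is issued. If the command still fails after up to 5 retries, an Abort is requested. If the Abort also gets no response for 10 seconds or more, the kernel forcibly sets the device offline.&lt;/li&gt;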
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;iSCSI Error handling
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
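&lt;li&gt;Independently of the I/O request timeout, the initiator and target periodically check each other with a ping to detect network or NAS problems. When the ping exceeds its timeout threshold, the session is not closed immediately; a recovery (replacement) timer gives it time to come back. If the session has not recovered when the timer expires, it is closed and the initiator periodically retries logging in to the target.&lt;/li&gt;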
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1345&quot; data-origin-height=&quot;533&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bqJj8G/btsJdF0uE4R/P8EH9sHWzUGscoARlwGh80/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bqJj8G/btsJdF0uE4R/P8EH9sHWzUGscoARlwGh80/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bqJj8G/btsJdF0uE4R/P8EH9sHWzUGscoARlwGh80/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbqJj8G%2FbtsJdF0uE4R%2FP8EH9sHWzUGscoARlwGh80%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1345&quot; height=&quot;533&quot; data-origin-width=&quot;1345&quot; data-origin-height=&quot;533&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&amp;nbsp;SCSI + iSCSI
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;When a SCSI command is sent to the iSCSI layer, the timer starts; when the response comes back through the iSCSI layer, the timer stops.&lt;/li&gt;
&lt;li&gt;When a command is retried the timer is reset as well, and when the timer expires the SCSI layer sends an ABORT_TASK request to the iSCSI layer.&lt;/li&gt;
&lt;li&gt;If the Abort request succeeds, the SCSI layer retries the command whose timer expired.&lt;/li&gt;
&lt;li&gt;If the Abort request times out, the iSCSI layer reports the failure to the SCSI layer, returns an error (ISCSI_ERR_SCSI_EH_SESSION_RST), and sends a TCP RST. The replacement_timeout timer then starts.&lt;/li&gt;
&lt;li&gt;When replacement_timeout expires, the SCSI device is kept offline until re-login to the target succeeds.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;4. Daemon &amp;amp; kernel settings&lt;/b&gt;&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;open-iscsi ping/NOP-Out settings
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The replacement timer applies when a single LUN has a problem.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1718532294943&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# cat /etc/iscsi/iscsid.conf
node.conn[0].timeo.noop_out_interval = 10
node.conn[0].timeo.noop_out_timeout = 15
...
node.session.timeo.replacement_timeout = 120
node.session.recovery_tmo = 120&lt;/code&gt;&lt;/pre&gt;
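&lt;p data-ke-size=&quot;size16&quot;&gt;As a rough sketch of how these settings add up, the helper below (a hypothetical function, not part of open-iscsi) estimates how long a silent path may go before outstanding I/O is failed back to the SCSI layer. It assumes that one full NOP-Out interval passes, the NOP-Out then times out, and the session is not re-established before replacement_timeout expires.&lt;/p&gt;

```shell
# Hypothetical helper: rough upper bound (seconds) between a path going
# silent and the iSCSI layer failing outstanding I/O to the SCSI layer.
# Assumption: interval + NOP-Out timeout + replacement timeout, with no
# successful re-login in between.
iscsi_failure_budget() {
    noop_interval=$1   # node.conn[0].timeo.noop_out_interval
    noop_timeout=$2    # node.conn[0].timeo.noop_out_timeout
    replacement=$3     # node.session.timeo.replacement_timeout
    echo $(( $noop_interval + $noop_timeout + $replacement ))
}

# Values taken from the iscsid.conf excerpt above
iscsi_failure_budget 10 15 120   # prints 145
```

&lt;p data-ke-size=&quot;size16&quot;&gt;With the values shown above the estimate is 145 seconds, which is worth comparing against the SCSI command timeout when deciding which layer should give up first.&lt;/p&gt;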
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;SCSI Timeout &amp;amp; max_retries&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1718531864343&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# cat /sys/block/sdx/device/timeout
30

# cat /sys/block/sdci/device/scsi_disk/15\:0\:0:105/max_retries
5&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;eh_deadline &amp;amp; eh_timeout&lt;/li&gt;
&lt;/ul&gt;
&lt;pre id=&quot;code_1718532139626&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# cat /sys/class/scsi_device/15\:0\:0:105/device/eh_timeout
10

# cat /sys/class/scsi_host/host0/eh_deadline
off&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- eh_timeout&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The timeout value for the SCSI ABORT_TASK command.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- eh_deadline (Default = off)&lt;/p&gt;
&lt;p style=&quot;color: #333333; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;The timer for error recovery. If this time is exceeded after a SCSI command times out, the HBA (Host Bus Adapter) is reset and the device is set offline.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
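&lt;p data-ke-size=&quot;size16&quot;&gt;To check the command timeout and eh_timeout of one disk in a single call, a small helper can read both from sysfs. This is only a minimal sketch: scsi_timers is a hypothetical name, and the sysfs root is passed as an argument purely so the function can be tried against a test directory.&lt;/p&gt;

```shell
# Hypothetical helper: print the SCSI command timeout and eh_timeout of a disk.
# $1 = sysfs root (normally /sys), $2 = block device name such as sdx
scsi_timers() {
    dev_dir="$1/block/$2/device"
    echo "timeout=$(cat "$dev_dir/timeout") eh_timeout=$(cat "$dev_dir/eh_timeout")"
}

# Usage on a real host (the device name is only an example):
# scsi_timers /sys sdx
```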
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;iSCSI and multipathd&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;multipathd is the daemon that provides DM (Device Mapper) Multipath I/O. On the host side it maps devices according to LUN and network path, and provides HA and load balancing for block I/O.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;250&quot; data-origin-height=&quot;366&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bGgUtR/btsOUXtZ9u8/RDACxpHFcapuxPSue1K7z0/img.jpg&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bGgUtR/btsOUXtZ9u8/RDACxpHFcapuxPSue1K7z0/img.jpg&quot; data-alt=&quot;Simple multipath example&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bGgUtR/btsOUXtZ9u8/RDACxpHFcapuxPSue1K7z0/img.jpg&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbGgUtR%2FbtsOUXtZ9u8%2FRDACxpHFcapuxPSue1K7z0%2Fimg.jpg&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;250&quot; height=&quot;366&quot; data-origin-width=&quot;250&quot; data-origin-height=&quot;366&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Simple multipath example&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
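&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Modern SDS products provide HA and load balancing on their own, so a multipathd configuration is probably not needed. If it is needed, however, the SCSI command timers and the redundancy policy overlap with it and have to be adjusted.&lt;/p&gt;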
&lt;pre id=&quot;code_1751092804971&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# /etc/multipath.conf
# (assumed here to be set in the defaults section)

defaults {
    fast_io_fail_tmo 5
    polling_interval 10
    no_path_retry fail
    failback immediate
}&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>System Engineering/Linux</category>
      <author>Hopulence</author>
      <guid isPermaLink="true">https://hopulence.tistory.com/45</guid>
      <comments>https://hopulence.tistory.com/45#entry45comment</comments>
      <pubDate>Sun, 16 Jun 2024 19:07:30 +0900</pubDate>
    </item>
    <item>
      <title>[Kernel Story] Linux I/O Scheduler</title>
      <link>https://hopulence.tistory.com/43</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
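&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;I am finally writing up the last remaining chapter, which I kept putting off because it was hard. This post is my study notes on the book below. Thanks to the author for publishing such a good book.&lt;/p&gt;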
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;http://www.yes24.com/Product/Goods/44376723&quot;&gt;http://www.yes24.com/Product/Goods/44376723&lt;/a&gt;&lt;span style=&quot;background-color: #1e1f21; color: #e6e6e9; text-align: left;&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1705330566739&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;book&quot; data-og-title=&quot;DevOps와 SE를 위한 리눅스 커널 이야기 - 예스24&quot; data-og-description=&quot;커널은 오랜 세월 기능이 추가되고 개선되어 오면서 완벽하게 이해하기 힘들 정도로 방대해졌다. 하지만 변하지 않는 기본 기능들이 있다. 이런 근간이 되는 기능에 대한 이해를 바탕으로 시스&quot; data-og-host=&quot;www.yes24.com&quot; data-og-source-url=&quot;http://www.yes24.com/Product/Goods/44376723&quot; data-og-url=&quot;https://www.yes24.com/Product/Goods/44376723&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/ZFBnl/hyU5ICKg95/YmwL2eCVX52sMaxM98wAsK/img.jpg?width=914&amp;amp;height=1200&amp;amp;face=0_0_914_1200,https://scrap.kakaocdn.net/dn/bJRQ3C/hyU5TEfCvK/RHw6U3pGNLL9b8Jqg0TkJ1/img.jpg?width=914&amp;amp;height=1200&amp;amp;face=0_0_914_1200,https://scrap.kakaocdn.net/dn/bE6xmy/hyU5KN5nKQ/KSZoGnxbYs0lDosJ0Yad9k/img.jpg?width=914&amp;amp;height=1200&amp;amp;face=0_0_914_1200&quot;&gt;&lt;a href=&quot;http://www.yes24.com/Product/Goods/44376723&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;http://www.yes24.com/Product/Goods/44376723&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/ZFBnl/hyU5ICKg95/YmwL2eCVX52sMaxM98wAsK/img.jpg?width=914&amp;amp;height=1200&amp;amp;face=0_0_914_1200,https://scrap.kakaocdn.net/dn/bJRQ3C/hyU5TEfCvK/RHw6U3pGNLL9b8Jqg0TkJ1/img.jpg?width=914&amp;amp;height=1200&amp;amp;face=0_0_914_1200,https://scrap.kakaocdn.net/dn/bE6xmy/hyU5KN5nKQ/KSZoGnxbYs0lDosJ0Yad9k/img.jpg?width=914&amp;amp;height=1200&amp;amp;face=0_0_914_1200');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;DevOps와 SE를 위한 리눅스 커널 이야기 - 예스24&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;커널은 오랜 세월 기능이 추가되고 개선되어 오면서 완벽하게 이해하기 힘들 정도로 방대해졌다. 하지만 변하지 않는 기본 기능들이 있다. 이런 근간이 되는 기능에 대한 이해를 바탕으로 시스&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;www.yes24.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Contents&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Why I/O schedulers are needed and what they do&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- I/O schedulers and parameter tuning&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; 1) Non-Multiqueue schedulers&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;gt; CFQ&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;gt; Deadline&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;gt; Noop&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Background of Multiqueue&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; 2) Multiqueue schedulers&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;gt; MQ-deadline&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;gt; None&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;gt; BFQ&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;gt; Kyber&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
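&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;When an I/O operation occurs in the system, it passes through the virtual file system and the local file system, is queued in the I/O scheduler after merging and sorting, and is then dispatched in order to the block device. Performance at this stage depends on the server's workload and on the I/O scheduler's algorithm.&lt;/p&gt;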
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1028&quot; data-origin-height=&quot;990&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/baq9cq/btsCMrf1wnQ/6JmFSMFLr1nfpbGsGDZE41/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/baq9cq/btsCMrf1wnQ/6JmFSMFLr1nfpbGsGDZE41/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/baq9cq/btsCMrf1wnQ/6JmFSMFLr1nfpbGsGDZE41/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbaq9cq%2FbtsCMrf1wnQ%2F6JmFSMFLr1nfpbGsGDZE41%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;726&quot; height=&quot;699&quot; data-origin-width=&quot;1028&quot; data-origin-height=&quot;990&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Why I/O schedulers are needed and what they do&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Unlike flash-based SSDs, HDDs incur latency from the physical movement of the head, so performance is maximized by serving as many I/O requests as possible each time the head moves to a disk sector.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;3058&quot; data-origin-height=&quot;1244&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cnm8qg/btsC6nwvUiE/EBp1pUuBTjmHFFJQifuUUK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cnm8qg/btsC6nwvUiE/EBp1pUuBTjmHFFJQifuUUK/img.png&quot; data-alt=&quot;HDD and SSD structure&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cnm8qg/btsC6nwvUiE/EBp1pUuBTjmHFFJQifuUUK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fcnm8qg%2FbtsC6nwvUiE%2FEBp1pUuBTjmHFFJQifuUUK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;796&quot; height=&quot;324&quot; data-origin-width=&quot;3058&quot; data-origin-height=&quot;1244&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;HDD and SSD structure&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
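&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;As in the example below, three I/O requests can be merged into one to optimize the path of the disk head.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;When I/O requests are not merged and sorted, the head has to move separately for every request to process the data. When they are merged and sorted, all of the data can be processed with fewer head movements.&lt;/p&gt;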
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;3494&quot; data-origin-height=&quot;1718&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/rP7cV/btsC2zx1sn0/FXdkcQ8N427Hu9qFH5k6dK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/rP7cV/btsC2zx1sn0/FXdkcQ8N427Hu9qFH5k6dK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/rP7cV/btsC2zx1sn0/FXdkcQ8N427Hu9qFH5k6dK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FrP7cV%2FbtsC2zx1sn0%2FFXdkcQ8N427Hu9qFH5k6dK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;3494&quot; height=&quot;1718&quot; data-origin-width=&quot;3494&quot; data-origin-height=&quot;1718&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;* How to check the disk type&lt;/p&gt;
&lt;pre id=&quot;code_1702358085413&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;root@master-1:~ # lsblk -d -o name,rota
NAME  ROTA
sda      0
sdb      0
sdc      1

# ROTA 1 &amp;rarr; HDD
#      0 &amp;rarr; SSD&lt;/code&gt;&lt;/pre&gt;
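&lt;p data-ke-size=&quot;size16&quot;&gt;The ROTA value can also be turned into a label inside a script. A minimal sketch follows; disk_type is a hypothetical helper that simply follows the mapping in the comment above. Strictly speaking, 0 means non-rotational rather than SSD specifically.&lt;/p&gt;

```shell
# Hypothetical helper: map the ROTA column of lsblk (the same value as
# /sys/block/DEV/queue/rotational) to a disk type label.
disk_type() {
    case "$1" in
        1) echo "HDD" ;;       # rotational media
        0) echo "SSD" ;;       # non-rotational media
        *) echo "unknown" ;;   # anything else is unexpected
    esac
}

disk_type 1   # prints HDD
disk_type 0   # prints SSD
```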
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;&amp;nbsp;I/O scheduler types and how to set them&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Schedulers come in two kinds: non-multiqueue and multiqueue.&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;A non-multiqueue scheduler is implemented with a single FIFO queue, whereas a multiqueue scheduler maps I/O requests onto several queues of the storage device at once and processes them in parallel.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The scheduler can be changed per disk with the commands below.&lt;/p&gt;
&lt;pre id=&quot;code_1704440544507&quot; class=&quot;bash&quot; style=&quot;background-color: #f8f8f8; color: #383a42; text-align: start;&quot; data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;bash&quot;&gt;&lt;code&gt;# Non-multiqueue scheduler (kernel 4.x and earlier)
root@compute-1-1:~# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]

root@compute-1-1:~# echo deadline &amp;gt; /sys/block/sda/queue/scheduler
root@compute-1-1:~# cat /sys/block/sda/queue/scheduler
noop [deadline] cfq&lt;/code&gt;&lt;/pre&gt;
&lt;p style=&quot;position: absolute;&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1705331927160&quot; class=&quot;bash&quot; style=&quot;background-color: #f8f8f8; color: #383a42;&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# Multiqueue scheduler (kernel 5.x and later)
master-2:# cat /sys/block/sda/queue/scheduler
[mq-deadline] kyber bfq none&lt;/code&gt;&lt;/pre&gt;
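&lt;p data-ke-size=&quot;size16&quot;&gt;In both outputs the active scheduler is the entry in square brackets. The sketch below pulls it out with plain shell parameter expansion; active_scheduler is a hypothetical helper name, not an existing tool.&lt;/p&gt;

```shell
# Hypothetical helper: print the scheduler that is marked active with
# [brackets] in the contents of /sys/block/DEV/queue/scheduler.
active_scheduler() {
    rest=${1#*\[}      # drop everything up to and including "["
    rest=${rest%%]*}   # drop "]" and everything after it
    echo "$rest"
}

active_scheduler "noop [deadline] cfq"            # prints deadline
active_scheduler "[mq-deadline] kyber bfq none"   # prints mq-deadline

# Usage on a real host (the device name is only an example):
# active_scheduler "$(cat /sys/block/sda/queue/scheduler)"
```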
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;&amp;nbsp; 1. cfq I/O scheduler (Completely Fair Queueing) [ non-multiqueue ]&lt;/b&gt;&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The cfq scheduler keeps a separate queue for each process and divides the disk bandwidth into fixed slices so that the I/O requests of every process are served fairly.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;970&quot; data-origin-height=&quot;686&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dk5UPz/btsDd35EcP5/OgbuxWJSKk15clo4zHgedK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dk5UPz/btsDd35EcP5/OgbuxWJSKk15clo4zHgedK/img.png&quot; data-alt=&quot;CFQ I/O scheduler architecture 1&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dk5UPz/btsDd35EcP5/OgbuxWJSKk15clo4zHgedK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fdk5UPz%2FbtsDd35EcP5%2FOgbuxWJSKk15clo4zHgedK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;653&quot; height=&quot;462&quot; data-origin-width=&quot;970&quot; data-origin-height=&quot;686&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;CFQ I/O scheduler architecture 1&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Let's look at this in more detail.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;When a process's I/O request passes through the block layer and enters cfq, it is classified by priority into &lt;b&gt;RT(Real Time)&lt;/b&gt;, &lt;b&gt;BE(Best Effort)&lt;/b&gt;, or &lt;b&gt;IDLE&lt;/b&gt;. RT is always served before BE and IDLE, so heavy RT I/O can starve lower-priority requests. RT and BE are each subdivided into sub-priorities 0 to 7. The IDLE class is served only when no I/O requests are pending in the other priority classes.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;After the priority class is assigned, requests are split again into per-workload groups called service trees (SYNC, SYNC_NOIDLE, ASYNC). The cfq queue that a request enters is then determined by the process that issued the I/O, and the queues are served round-robin with an equal time slice each.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1045&quot; data-origin-height=&quot;598&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b3il4I/btsBQnSfL1X/NXLSsKOSiw6D3h7ZrHWXZk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b3il4I/btsBQnSfL1X/NXLSsKOSiw6D3h7ZrHWXZk/img.png&quot; data-alt=&quot;CFQ I/O scheduler architecture 2&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b3il4I/btsBQnSfL1X/NXLSsKOSiw6D3h7ZrHWXZk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb3il4I%2FbtsBQnSfL1X%2FNXLSsKOSiw6D3h7ZrHWXZk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1045&quot; height=&quot;598&quot; data-origin-width=&quot;1045&quot; data-origin-height=&quot;598&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;CFQ I/O scheduler architecture 2&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;SYNC&lt;/b&gt;&lt;span&gt;&amp;nbsp;&lt;/span&gt;: Sequential synchronous I/O, meaning&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;b&gt;sequential reads&lt;/b&gt; or &lt;b&gt;direct writes&lt;/b&gt;.&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;u&gt;Because the work is sequential, the next I/O request is likely to land next to the previous disk sector&lt;/u&gt;. So,&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;u&gt;after serving the queue, cfq idles for a short time to wait for requests to sectors adjacent to the current head position&lt;/u&gt;. For example, when an application reads a config file, it must read the whole file before it can run properly, so the read is a synchronous operation that blocks the application until it completes.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;SYNC_NOIDLE&lt;/b&gt;&lt;span&gt;&amp;nbsp;&lt;/span&gt;: Random synchronous I/O, meaning&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;b&gt;random reads&lt;/b&gt;. Because the disk sector of the next request cannot be predicted, the head moves on immediately without idling.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;ASYNC&lt;/b&gt;&lt;span&gt;&amp;nbsp;&lt;/span&gt;: Asynchronous I/O, meaning&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;b&gt;buffered writes&lt;/b&gt;. Writes from all processes are collected here and handled together.&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;* Most requests fall into BE; the priority class and level can be changed with the ionice command.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;* cfq scheduler parameters&lt;/b&gt;&lt;/p&gt;
&lt;pre id=&quot;code_1704710672221&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;root@compute-1-1:~# ls /sys/block/sda/queue/iosched
back_seek_max &amp;nbsp; &amp;nbsp; &amp;nbsp;group_idle &amp;nbsp; &amp;nbsp; slice_async &amp;nbsp; &amp;nbsp; slice_idle_us &amp;nbsp; target_latency_us
back_seek_penalty &amp;nbsp;group_idle_us &amp;nbsp;slice_async_rq &amp;nbsp;slice_sync
fifo_expire_async &amp;nbsp;low_latency &amp;nbsp; &amp;nbsp;slice_async_us &amp;nbsp;slice_sync_us
fifo_expire_sync &amp;nbsp; quantum &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;slice_idle &amp;nbsp; &amp;nbsp; &amp;nbsp;target_latency&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;b&gt;back_seek_max&lt;/b&gt; : The maximum backward seek distance, in KB, measured from the disk head's current position. A request that lies behind the head but within back_seek_max is still treated as a candidate for the next request. When sequential writes are frequent, lowering this value keeps head movement to a minimum, but other requests may be delayed accordingly. (Default 16384 KB = 16 MB)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;back_seek_penalty &lt;/b&gt;: The penalty applied to backward seeks. Moving the disk head 1 &amp;rarr; 2 &amp;rarr; 3 &amp;rarr; 4 is generally more efficient than moving it 1 &amp;rarr; 3 &amp;rarr; 2 &amp;rarr; 4, so backward seeks are penalized. Suppose the head is at 1,024KB and requests for 1,008KB and 1,040KB arrive at the same time: both are 16KB away. The backward seek's distance is multiplied by the penalty (Default 2) and becomes 32KB, so the head seeks forward first.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;fifo_expire_async&lt;/b&gt; : The expiry time for async (write) requests in CFQ's FIFO list. (Default 250ms)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;fifo_expire_sync &lt;/b&gt;: The expiry time for sync (read) requests in CFQ's FIFO list. (Default 125ms)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;group_idle&lt;/b&gt; : When moving to another process queue in the same cgroup, cfq normally switches without idling. Before moving to a queue in a different cgroup, however, it waits for group_idle in case more I/O arrives from the current cgroup. (Default 8ms)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;quantum &lt;/b&gt;&lt;span style=&quot;letter-spacing: 0px;&quot;&gt;: The maximum number of SYNC (synchronous) requests that can be handed to the dispatch queue at a time. Within a queue's time slice, no further request is dispatched once the number of requests in the device exceeds this value. On systems with multiple disks, I/O can be processed in parallel, so raising quantum lets more requests be pulled from a queue at once and can improve I/O performance. A larger value also lengthens the time a single queue takes to run, so some I/O may get slower. Tune it together with slice_async_rq. (Default 8)&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;low_latency&lt;/b&gt; : Enables or disables low latency mode. &lt;u&gt;When enabled&lt;/u&gt;, the time slice is adjusted if the expected service time of the queues exceeds target_latency. Before assigning time slices, cfq counts the requests in each group (RT, BE, IDLE) and computes expect_latency (= total requests in the group * time slice), the time needed to serve all of that group's queues. If the result exceeds target_latency, the time slice is reduced so that it stays within that value. (Default 1 = enable)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;target_latency&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;/b&gt;: The expected latency used to adjust time slices in low latency mode. (Default 300ms)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;slice_async &lt;/b&gt;: The base value used to calculate the time slice of ASYNC queues. (Default 40ms)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;slice_async_rq&lt;/b&gt;&lt;span&gt;&amp;nbsp;: The maximum number of ASYNC requests that can be dispatched in one time slice. (Default 2)&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;slice_idle&lt;/b&gt; :&lt;span&gt; How long cfq waits, without moving to the next queue, after it has served all of a queue's I/O requests within the time slice. Most I/O is sequential rather than random, so idling keeps disk head movement to a minimum. &lt;u&gt;0 for SSDs&lt;/u&gt; and &lt;u&gt;a non-zero value for SAS/SATA disks&lt;/u&gt; is recommended. (Default 0 or 8ms depending on the kernel version)&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;slice_sync&amp;nbsp;&lt;/b&gt;: The base value used to calculate the time slice of SYNC queues. (Default 100ms)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;s&gt;group_isolation&lt;span&gt;&amp;nbsp;&lt;/span&gt;(Deprecated)&lt;/s&gt;&lt;/b&gt;&lt;s&gt;&lt;span&gt;&amp;nbsp;&lt;/span&gt;:&lt;span&gt;&amp;nbsp;&lt;/span&gt;If this value is 0, queues belonging to SYNC_NOIDLE (random access) are moved to the root cgroup and served there.&amp;nbsp;&lt;/s&gt;&lt;/li&gt;
&lt;/ul&gt;
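&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The back_seek_penalty example from the list above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not kernel code: the function names are hypothetical, and treating a request beyond back_seek_max as simply ineligible is a simplifying assumption.&lt;/p&gt;

```python
def seek_cost_kb(head_kb, target_kb, back_seek_max_kb=16384, back_seek_penalty=2):
    """Effective seek distance in KB, or None if the target is too far behind the head."""
    distance = abs(target_kb - head_kb)
    if target_kb >= head_kb:
        return distance  # forward seek: no penalty
    if distance > back_seek_max_kb:
        return None  # assumption: too far behind the head to be a candidate
    return distance * back_seek_penalty  # backward seek: distance is penalized


def pick_next(head_kb, candidates_kb):
    """Pick the candidate position with the lowest effective seek cost."""
    costs = {c: seek_cost_kb(head_kb, c) for c in candidates_kb}
    eligible = {c: cost for c, cost in costs.items() if cost is not None}
    return min(eligible, key=eligible.get)


print(seek_cost_kb(1024, 1008))       # 32  (16KB backward, doubled by the penalty)
print(seek_cost_kb(1024, 1040))       # 16  (16KB forward)
print(pick_next(1024, [1008, 1040]))  # 1040
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Raising back_seek_penalty in this model makes forward requests win more often, which matches the intent described for the parameter.&lt;/p&gt;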
&lt;p data-ke-size=&quot;size16&quot;&gt;* Parameters ending in _us set the same value in microseconds (us) instead of milliseconds (ms).&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Ex) slice_sync(100ms) &amp;harr; slice_sync_us(100000us)&lt;/p&gt;
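&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The low_latency adjustment described in the parameter list can be sketched as follows. This is a rough model under stated assumptions, not the kernel's actual formula: the function name is hypothetical, the expected latency is taken as the base slice multiplied by the number of busy queues, and the slice is scaled proportionally.&lt;/p&gt;

```python
def adjusted_slice_ms(base_slice_ms, busy_queues, target_latency_ms=300):
    """Shrink the per-queue time slice when the expected latency exceeds the target."""
    expected_latency_ms = base_slice_ms * busy_queues
    if expected_latency_ms > target_latency_ms:
        # Scale the slice so that one full round fits within target_latency.
        return base_slice_ms * target_latency_ms / expected_latency_ms
    return base_slice_ms


print(adjusted_slice_ms(100, 2))  # 100   (200ms expected, under the 300ms target)
print(adjusted_slice_ms(100, 6))  # 50.0  (600ms expected, scaled down to fit 300ms)
```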
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;&amp;nbsp; 2. deadline I/O scheduler [ non-multiqueue ]&lt;/b&gt;&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The deadline scheduler assigns each I/O request a deadline by which it should be served, which bounds how long any request can starve.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;962&quot; data-origin-height=&quot;680&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dnOicQ/btsC6Ua9slH/Ad5HkllwD5ekFtK1Xdc44k/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dnOicQ/btsC6Ua9slH/Ad5HkllwD5ekFtK1Xdc44k/img.png&quot; data-alt=&quot;Deadline I/O scheduler architecture 1&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dnOicQ/btsC6Ua9slH/Ad5HkllwD5ekFtK1Xdc44k/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdnOicQ%2FbtsC6Ua9slH%2FAd5HkllwD5ekFtK1Xdc44k%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;690&quot; height=&quot;488&quot; data-origin-width=&quot;962&quot; data-origin-height=&quot;680&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Deadline I/O scheduler architecture 1&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The deadline scheduler has two sorted lists and two fifo lists. The sorted lists hold read requests and write requests respectively, ordered by disk sector. The fifo lists also hold read and write requests respectively, but ordered by deadline, which follows from each request's arrival time.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;As long as no request passes its deadline, requests are served in sorted-list order. When a request does pass its deadline, it is served first. Suppose the sorted list holds requests for sectors 1, 10, 30, 50, and 70. If none of them expires, they are served as 1&amp;rarr;10&amp;rarr;30&amp;rarr;50&amp;rarr;70. If the request for sector 50 times out right after sector 10 is served, it is handled first, so the remaining order becomes 50&amp;rarr;30&amp;rarr;70 instead of 30&amp;rarr;50&amp;rarr;70.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1238&quot; data-origin-height=&quot;905&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/n4yn0/btsDxTVGPxe/zb82wVOoYkcP9k5Xi3Kg50/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/n4yn0/btsDxTVGPxe/zb82wVOoYkcP9k5Xi3Kg50/img.png&quot; data-alt=&quot;Deadline I/O scheduler architecture 2&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/n4yn0/btsDxTVGPxe/zb82wVOoYkcP9k5Xi3Kg50/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fn4yn0%2FbtsDxTVGPxe%2Fzb82wVOoYkcP9k5Xi3Kg50%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;682&quot; height=&quot;499&quot; data-origin-width=&quot;1238&quot; data-origin-height=&quot;905&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Deadline I/O scheduler architecture 2&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
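&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The sector example above can be replayed with a small simulation. It is a simplified model of the behavior described in this post, not the kernel implementation (which, for instance, dispatches in batches). The function name and the expires_after_dispatches map, which states after how many completed dispatches a request's deadline has passed, are hypothetical.&lt;/p&gt;

```python
def dispatch_order(sectors, expires_after_dispatches):
    """Return the order in which pending requests are served.

    sectors: sectors of the pending requests.
    expires_after_dispatches: hypothetical map of sector -> number of completed
        dispatches after which that request's deadline has passed.
    """
    pending = sorted(sectors)  # sorted list: ordered by sector
    served = []
    never = float("inf")
    while pending:
        done = len(served)
        expired = [s for s in pending if done >= expires_after_dispatches.get(s, never)]
        # An expired request jumps the queue; otherwise follow sector order.
        nxt = expired[0] if expired else pending[0]
        pending.remove(nxt)
        served.append(nxt)
    return served


print(dispatch_order([1, 10, 30, 50, 70], {}))       # [1, 10, 30, 50, 70]
print(dispatch_order([1, 10, 30, 50, 70], {50: 2}))  # [1, 10, 50, 30, 70]
```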
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;* deadline scheduler parameters&lt;/b&gt;&lt;/p&gt;
&lt;pre id=&quot;code_1704716460393&quot; class=&quot;bash&quot; style=&quot;background-color: #f8f8f8; color: #383a42; text-align: start;&quot; data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;bash&quot;&gt;&lt;code&gt;root@compute-1-1:~# ls /sys/block/sda/queue/iosched/
fifo_batch  front_merges  read_expire  write_expire  writes_starved&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;b&gt;fifo_batch&lt;/b&gt; : The number of I/O requests dispatched in a single batch. A lower value means less work per batch and therefore lower latency; a higher value increases throughput. (Default 16)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;front_merges&lt;/b&gt; : Controls whether a new request may be merged at the front of a request that is already queued (a front merge), in addition to the usual back merge. If the workload rarely produces front-merge candidates, as with mostly sequential reads, setting this to 0 avoids the overhead of looking for them. (Default 1)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;read_expire&lt;/b&gt; : The expiry time for read requests. A request that has not been served within this time after entering the fifo list is moved ahead of the sorted-list order. (Default 500ms)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;write_expire&lt;/b&gt; : The expiry time for write requests. (Default&amp;nbsp; 5000ms)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;writes_starved&lt;/b&gt; : The maximum number of read batches that may be served before a write batch must be served. The deadline scheduler prefers reads, so writes can be delayed; this value adjusts the balance between reads and writes. (Default 2)&lt;/li&gt;
&lt;/ul&gt;
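&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;To make writes_starved concrete, the sketch below shows the batch pattern it produces. It assumes that both reads and writes are always pending, and the function name is hypothetical; it illustrates the rule described in the list above rather than the kernel code itself.&lt;/p&gt;

```python
def batch_order(n_batches, writes_starved=2):
    """Return the sequence of read/write batches when both kinds are always pending."""
    order = []
    starved = 0  # read batches served since the last write batch
    for _ in range(n_batches):
        if starved >= writes_starved:
            order.append("write")  # writes have waited long enough
            starved = 0
        else:
            order.append("read")   # reads are preferred by default
            starved += 1
    return order


print(batch_order(6))
# ['read', 'read', 'write', 'read', 'read', 'write']
print(batch_order(4, writes_starved=1))
# ['read', 'write', 'read', 'write']
```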
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;* CFQ vs Deadline&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Web server&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;On a web server, the most common I/O is usually &lt;u&gt;logging&lt;/u&gt;. Logs are appended to the end of a file, so access is mostly sequential, and since few file descriptors are kept open, there are not many I/O requests either. As a result, cfq and deadline perform almost the same.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- File server&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;A file server sees many kinds of access, so &lt;u&gt;random access&lt;/u&gt; is common and user requests can generate a large amount of I/O. When &lt;u&gt;many processes issue I/O on an equal footing&lt;/u&gt;, as with &lt;u&gt;video streaming&lt;/u&gt; or &lt;u&gt;encoding servers&lt;/u&gt;, the &lt;u&gt;cfq scheduler&lt;/u&gt; performs better. When &lt;u&gt;one particular process issues most of the I/O&lt;/u&gt;, as with a &lt;u&gt;DB&lt;/u&gt;, the &lt;u&gt;deadline scheduler&lt;/u&gt; performs better: requests are served by arrival time, so there is no idle time in which that process's requests wait for their time slice.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;361&quot; data-origin-height=&quot;301&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bW3J8L/btsDrjVSYer/KgN6swGr43wSdZm5FAxyG0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bW3J8L/btsDrjVSYer/KgN6swGr43wSdZm5FAxyG0/img.png&quot; data-alt=&quot;How each I/O scheduler processes requests in a multi-process environment&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bW3J8L/btsDrjVSYer/KgN6swGr43wSdZm5FAxyG0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbW3J8L%2FbtsDrjVSYer%2FKgN6swGr43wSdZm5FAxyG0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;361&quot; height=&quot;301&quot; data-origin-width=&quot;361&quot; data-origin-height=&quot;301&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;How each I/O scheduler processes requests in a multi-process environment&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;&amp;nbsp; 3. noop I/O scheduler [ non-multiqueue ]&lt;/b&gt;&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The noop scheduler only merges requests and does no sorting. It is mainly used in the following cases.&lt;/p&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;Comparing performance against other I/O schedulers&lt;/li&gt;
&lt;li&gt;The storage has its own scheduler (intelligent storage or a multipath environment)&lt;/li&gt;
&lt;li&gt;Hosting VMs (the host and the VM each run their own scheduler, which can degrade performance)&lt;/li&gt;
&lt;li&gt;Using SSDs (the time to reach any sector is the same, so sorting would be unnecessary work)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;964&quot; data-origin-height=&quot;683&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/mkaXm/btsC8VtJFR0/gaBp76MLka8pjzSEVhO43K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/mkaXm/btsC8VtJFR0/gaBp76MLka8pjzSEVhO43K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/mkaXm/btsC8VtJFR0/gaBp76MLka8pjzSEVhO43K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FmkaXm%2FbtsC8VtJFR0%2FgaBp76MLka8pjzSEVhO43K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;617&quot; height=&quot;437&quot; data-origin-width=&quot;964&quot; data-origin-height=&quot;683&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;&amp;nbsp;Background of multiqueue&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;With traditional HDDs, the physical movement of the disk head was the bottleneck and limited random access. Multiprocessor systems and SSDs then arrived and kept improving, making fast, parallel processing possible, but the existing single queue had the following problems.&lt;/p&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;Contention between CPUs over the mutual exclusion that protects the queue's data integrity.&lt;/li&gt;
&lt;li&gt;Frequent context switching caused by interrupts on the CPU that handles the I/O request queue (CPU 0 in the figure below).&lt;/li&gt;
&lt;li&gt;The&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;a style=&quot;color: #0070d1;&quot; href=&quot;https://hopulence.tistory.com/31&quot;&gt;NUMA remote access&lt;/a&gt; problem that comes with multiprocessor systems.&lt;/li&gt;
&lt;li&gt;Imbalance at the queue between the growing number of running processes and I/O requests as core counts rise, and the storage device's IOPS (I/O per second).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;724&quot; data-origin-height=&quot;354&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/lYbRN/btsDNo1Z16l/uKqklRPyy4MHI3PW9L7KZ0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/lYbRN/btsDNo1Z16l/uKqklRPyy4MHI3PW9L7KZ0/img.png&quot; data-alt=&quot;Overview of bottlenecks in the block layer on a system equipped with two cores and a SSD&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/lYbRN/btsDNo1Z16l/uKqklRPyy4MHI3PW9L7KZ0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FlYbRN%2FbtsDNo1Z16l%2FuKqklRPyy4MHI3PW9L7KZ0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;530&quot; height=&quot;259&quot; data-origin-width=&quot;724&quot; data-origin-height=&quot;354&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Overview of bottlenecks in the block layer on a system equipped with two cores and a SSD&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;In the end the single queue became a bottleneck of its own. To address this, multiqueue (mq) was introduced with two kinds of queues: software staging queues and hardware dispatch queues.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;A &lt;b&gt;software staging queue&lt;/b&gt; is mapped to a CPU core or socket. It buffers the I/O requests of the processes running there, merges requests to adjacent sectors, and applies the I/O scheduler's logic before passing the requests to the hardware dispatch queue it is mapped to.&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;A &lt;b&gt;hardware dispatch queue&lt;/b&gt; hands I/O requests to the driver (submission queue), pacing them so that the storage device's buffer does not overflow. The number of hardware queues depends on the driver but cannot exceed the number of CPU cores.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2889&quot; data-origin-height=&quot;1727&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/QHQ3M/btsDGLq22SU/w79KHK1XDKBjMgbgYfieHk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/QHQ3M/btsDGLq22SU/w79KHK1XDKBjMgbgYfieHk/img.png&quot; data-alt=&quot;Linux block layer design.&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/QHQ3M/btsDGLq22SU/w79KHK1XDKBjMgbgYfieHk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FQHQ3M%2FbtsDGLq22SU%2Fw79KHK1XDKBjMgbgYfieHk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;796&quot; height=&quot;476&quot; data-origin-width=&quot;2889&quot; data-origin-height=&quot;1727&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Linux block layer design.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Unlike the single queue, multi-queue also tags I/O. When the driver finishes an I/O operation requested from userspace, it passes a unique integer back to the block layer as a completion tag, so the block layer does not have to search for which request has completed.&lt;/p&gt;
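&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The idea behind tagging can be illustrated with a small sketch. The TagSet class below is hypothetical and only mimics the concept: each in-flight request holds an integer tag that doubles as an array index, so completing a request is a direct lookup rather than a search. It does not represent the kernel's blk-mq data structures.&lt;/p&gt;

```python
class TagSet:
    """Toy model of tag-based completion: the tag is an index into a fixed array."""

    def __init__(self, depth):
        self.free_tags = list(range(depth))  # integer tags not currently in use
        self.in_flight = [None] * depth      # tag -> request, indexed directly

    def submit(self, request):
        """Assign a free tag to the request, or return None if the queue is full."""
        if not self.free_tags:
            return None
        tag = self.free_tags.pop()
        self.in_flight[tag] = request
        return tag

    def complete(self, tag):
        """Look up the finished request by its tag and release the tag."""
        request = self.in_flight[tag]  # direct index lookup, no search
        self.in_flight[tag] = None
        self.free_tags.append(tag)
        return request


tags = TagSet(depth=4)
t1 = tags.submit("read sector 10")
t2 = tags.submit("write sector 70")
print(tags.complete(t2))  # write sector 70
print(tags.complete(t1))  # read sector 10
```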
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;&amp;nbsp; 4. mq-deadline I/O scheduler [ multiqueue ]&lt;/b&gt;&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;It works like the original deadline scheduler but uses multiqueue for parallel processing. I/O that has passed its deadline is reordered in the software queue and then passed to the hardware queue.&lt;/p&gt;
&lt;pre id=&quot;code_1705674547876&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;master-1:# ls /sys/block/sda/queue/iosched/
async_depth  fifo_batch  front_merges  read_expire  write_expire  writes_starved&lt;/code&gt;&lt;/pre&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;b&gt;async_depth&lt;/b&gt; : The maximum number of ASYNC (write) requests that can be processed at the same time. (Default 48)&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;&amp;nbsp; 5. none I/O scheduler &lt;/b&gt;&lt;b&gt;[ multiqueue ]&lt;/b&gt;&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;&amp;nbsp;&lt;/span&gt;Implemented as a FIFO and similar to the noop scheduler. It is mainly used to avoid the overhead of scheduling itself.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;&amp;nbsp; 6. BFQ (Budget Fair Queueing) I/O scheduler &lt;/b&gt;&lt;b&gt;[ multiqueue ]&lt;/b&gt;&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The BFQ scheduler is based on CFQ, but instead of a fixed time slice it assigns each process a budget, measured in sectors, and schedules I/O batches against it. This keeps a single application from consuming all of the I/O bandwidth, and it favors low latency over I/O throughput. It suits real-time systems that are sensitive to latency or loss, such as &lt;u&gt;streaming services&lt;/u&gt; and &lt;u&gt;packet dump (SPAN)&lt;/u&gt;.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;* bfq scheduler parameters&lt;/b&gt;&lt;/p&gt;
&lt;pre id=&quot;code_1705674586312&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;master-1:# ls /sys/block/sda/queue/iosched/
back_seek_max  back_seek_penalty  fifo_expire_async  fifo_expire_sync  low_latency  max_budget  slice_idle  slice_idle_us  strict_guarantees  timeout_sync&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;slice_idle&lt;/b&gt; : See cfq. Unlike the single-queue CFQ, the multi-queue BFQ idles so that, while process A's I/O is being served from queue A, requests from another queue B are not dispatched in between. This &lt;u&gt;preserves the continuity&lt;/u&gt; of process A's I/O and helps throughput for sequential workloads. For random workloads, however, the idle time can reduce throughput. (Default 8ms)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;strict_guarantees&lt;/b&gt;&lt;span&gt;&amp;nbsp;&lt;/span&gt;: When set, bfq always idles whenever a queue becomes empty, and it makes the storage device handle one I/O request at a time by dispatching a new request only when no other request is outstanding. slice_idle guarantees dispatch order, while strict_guarantees guarantees the order in which I/O is actually served. (Default 0 = unset)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;low_latency&lt;/b&gt; : See cfq. (Default 'enabled')&lt;/li&gt;
&lt;li&gt;&lt;b&gt;timeout_sync&lt;/b&gt; : The maximum device time a single task (queue) can be given once it is selected for service. On storage with long seek times, raising this value can improve throughput. On devices with short seek times, a larger value lets a single task hold the device longer, which can increase latency. (Default 124ms)&amp;nbsp;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;max_budget&lt;/b&gt; : The maximum number of sectors (budget) that can be served within timeout_sync. On systems with mostly sequential access, raising this value can improve throughput at the cost of higher latency. (Default 0 = auto-tuning enabled)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;back_seek_max&lt;/b&gt; : See cfq. (Default 16384 KB = 16 MB)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;back_seek_penalty&lt;/b&gt; : See cfq. (Default 2)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;fifo_expire_async&lt;/b&gt; : See cfq. (Default 250ms)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;fifo_expire_sync&lt;/b&gt; : See cfq. (Default 125ms)&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;* cgroup support&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;bfq supports both cgroup-v1 and cgroup-v2. Each cgroup is given a weight, and the total I/O bandwidth is shared in proportion to those weights.&lt;/p&gt;
&lt;pre id=&quot;code_1717129475506&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;master-1:~ # cat /sys/fs/cgroup/blkio/kubepods.slice/blkio.bfq.weight
136&lt;/code&gt;&lt;/pre&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;blkio.bfq.weight : The weight setting in cgroup-v1. When low_latency is enabled, the scheduler changes this value automatically. (Default 100)&lt;/li&gt;
&lt;li&gt;io.bfq.weight : The weight setting in cgroup-v2.&lt;/li&gt;
&lt;li&gt;blkio.bfq.weight_device : Sets a weight per storage device.&lt;/li&gt;
&lt;/ul&gt;
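&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;As a quick illustration of proportional sharing, the sketch below converts weights into bandwidth fractions. The cgroup names and the function are hypothetical, and the model assumes every group is continuously issuing I/O; an idle group's share would be redistributed in practice.&lt;/p&gt;

```python
def bandwidth_shares(weights):
    """Map each cgroup's weight to its fraction of the total I/O bandwidth."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical cgroups: group_a keeps the default weight, group_b is tripled.
shares = bandwidth_shares({"group_a": 100, "group_b": 300})
print(shares)  # {'group_a': 0.25, 'group_b': 0.75}
```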
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&amp;nbsp;&lt;/h4&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;&amp;nbsp; 7. kyber I/O scheduler &lt;/b&gt;&lt;b&gt;[ multiqueue ]&lt;/b&gt;&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; The kyber scheduler consists of two queues, one for synchronous reads and one for asynchronous writes. It &lt;u&gt;sets a target latency for each and dynamically throttles I/O requests to meet it&lt;/u&gt;.&amp;nbsp;&lt;span style=&quot;letter-spacing: 0px;&quot;&gt;Kyber uses tokens to limit I/O service time, and it dispatches to the hardware queue round-robin for as long as the read and write tokens remain valid. The more tokens there are, the longer I/O is guaranteed service, which makes the target latency easier to reach.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;letter-spacing: 0px;&quot;&gt;&amp;nbsp;If the current I/O latency is below the target latency, the target is already being met, so the token count is kept as it is. If the current I/O latency is above the target latency, the token count is increased to give I/O more service time so that the target can be met.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;334&quot; data-origin-height=&quot;254&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/9bie0/btsHMQfRIQy/c3XfCYqTGj4PCzufYK3Er1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/9bie0/btsHMQfRIQy/c3XfCYqTGj4PCzufYK3Er1/img.png&quot; data-alt=&quot;Architecture of the Linux Kyber I/O scheduler&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/9bie0/btsHMQfRIQy/c3XfCYqTGj4PCzufYK3Er1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F9bie0%2FbtsHMQfRIQy%2Fc3XfCYqTGj4PCzufYK3Er1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;334&quot; height=&quot;254&quot; data-origin-width=&quot;334&quot; data-origin-height=&quot;254&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Architecture of the Linux Kyber I/O scheduler&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;current latency &amp;gt; target latency &amp;rarr; increase the token count&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;current latency &amp;lt; target latency &amp;rarr; keep the token count&lt;/p&gt;
&lt;pre id=&quot;code_1705674623543&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;master-1:# ls /sys/block/sda/queue/iosched/
read_lat_nsec  write_lat_nsec&lt;/code&gt;&lt;/pre&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;read_lat_nsec :&amp;nbsp; The target latency for synchronous reads. (Default 2000000ns = 2ms)&lt;/li&gt;
&lt;li&gt;write_lat_nsec : The target latency for asynchronous writes. (Default 10000000ns = 10ms)&lt;/li&gt;
&lt;/ul&gt;
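&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The token rule summarized in this section can be written as a tiny function. This follows the simplified rule stated in this post only; the function name, the step of one token, and the max_tokens cap are assumptions made for illustration and are not taken from the kyber source.&lt;/p&gt;

```python
def next_token_count(tokens, observed_latency_ns, target_latency_ns, max_tokens):
    """Apply the rule: above target -> add a token (up to a cap); otherwise keep."""
    if observed_latency_ns > target_latency_ns:
        return min(tokens + 1, max_tokens)
    return tokens


READ_TARGET_NS = 2_000_000  # read_lat_nsec default (2ms)

print(next_token_count(8, 3_000_000, READ_TARGET_NS, max_tokens=16))  # 9
print(next_token_count(8, 1_000_000, READ_TARGET_NS, max_tokens=16))  # 8
```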
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 style=&quot;text-align: start;&quot; data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;&amp;nbsp; Closing thoughts&lt;/b&gt;&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The two sources I consulted to compare scheduler performance reached different results. Their variables and criteria differ, so a direct comparison is difficult, but it appears that &lt;u&gt;as SSDs get faster, the simply implemented Kyber outperforms mq-deadline and bfq, which carry CPU overhead from locking and the like&lt;/u&gt;. Some now say that the I/O bottleneck is no longer the disk but the CPU.&amp;nbsp; &amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;921&quot; data-origin-height=&quot;251&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cifMzp/btsH00Xp0JP/YOIvoiHdc60DheGzKj0HM1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cifMzp/btsH00Xp0JP/YOIvoiHdc60DheGzKj0HM1/img.png&quot; data-alt=&quot;SSD performance comparison&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cifMzp/btsH00Xp0JP/YOIvoiHdc60DheGzKj0HM1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcifMzp%2FbtsH00Xp0JP%2FYOIvoiHdc60DheGzKj0HM1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;921&quot; height=&quot;251&quot; data-origin-width=&quot;921&quot; data-origin-height=&quot;251&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;SSD performance comparison&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.phoronix.com/review/linux-56-nvme&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;I/O scheduler benchmark with &lt;/a&gt;&lt;a style=&quot;background-color: #e6f5ff; color: #0070d1; text-align: start;&quot; href=&quot;https://www.phoronix.com/review/linux-56-nvme&quot;&gt;CORSAIR SSD&amp;nbsp;&lt;/a&gt;&lt;a href=&quot;https://www.phoronix.com/review/linux-56-nvme&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt; (2020)&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a style=&quot;background-color: #e6f5ff; color: #0070d1; text-align: start;&quot; href=&quot;https://atlarge-research.com/pdfs/2024-io-schedulers.pdf&quot;&gt;I/O scheduler benchmark with&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;/a&gt;&lt;a style=&quot;background-color: #e6f5ff; color: #0070d1; text-align: start;&quot; href=&quot;https://atlarge-research.com/pdfs/2024-io-schedulers.pdf&quot;&gt;Samsung SSD&lt;/a&gt;&lt;a style=&quot;color: #0070d1; text-align: start;&quot; href=&quot;https://atlarge-research.com/pdfs/2024-io-schedulers.pdf&quot;&gt;&amp;nbsp;(2024)&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Reference&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://ji007.tistory.com/entry/IO-Schedulers&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://ji007.tistory.com/entry/IO-Schedulers&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1703851655152&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;I/O Schedulers&quot; data-og-description=&quot;I/O Scheduler 는 block layer에 해당하며 file system으로부터 bio structure에 대한 request submit 을 받아 Block I/O에 대한 동작 요청을 I/O Request queue로 전달하는 역할을 한다. I/O Request queue에 전달된 I/O는 device dr&quot; data-og-host=&quot;ji007.tistory.com&quot; data-og-source-url=&quot;https://ji007.tistory.com/entry/IO-Schedulers&quot; data-og-url=&quot;https://ji007.tistory.com/entry/IO-Schedulers&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/cSn79u/hyUXX0eT4S/wrDwe8uRBKmNvKv3rGgjw0/img.png?width=748&amp;amp;height=739&amp;amp;face=0_0_748_739,https://scrap.kakaocdn.net/dn/TQCHC/hyUTJif9uF/qJlrxh0BxpSCK7NwaDDRcK/img.png?width=748&amp;amp;height=739&amp;amp;face=0_0_748_739,https://scrap.kakaocdn.net/dn/V4ulH/hyUTw4j877/lQjPbbVCFaE3ko00bKs5Ik/img.png?width=748&amp;amp;height=739&amp;amp;face=0_0_748_739&quot;&gt;&lt;a href=&quot;https://ji007.tistory.com/entry/IO-Schedulers&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://ji007.tistory.com/entry/IO-Schedulers&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/cSn79u/hyUXX0eT4S/wrDwe8uRBKmNvKv3rGgjw0/img.png?width=748&amp;amp;height=739&amp;amp;face=0_0_748_739,https://scrap.kakaocdn.net/dn/TQCHC/hyUTJif9uF/qJlrxh0BxpSCK7NwaDDRcK/img.png?width=748&amp;amp;height=739&amp;amp;face=0_0_748_739,https://scrap.kakaocdn.net/dn/V4ulH/hyUTw4j877/lQjPbbVCFaE3ko00bKs5Ik/img.png?width=748&amp;amp;height=739&amp;amp;face=0_0_748_739');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;I/O Schedulers&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;I/O Scheduler 는 block layer에 해당하며 file system으로부터 bio structure에 대한 request submit 을 받아 Block I/O에 대한 동작 요청을 I/O Request queue로 전달하는 역할을 한다. I/O Request queue에 전달된 I/O는 device dr&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;ji007.tistory.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://blog.mi.hdm-stuttgart.de/index.php/2022/04/01/improve-your-storage-i-o-performance-today/&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://blog.mi.hdm-stuttgart.de/index.php/2022/04/01/improve-your-storage-i-o-performance-today/&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1703852141530&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;Improve your storage I/O performance today | Computer Science Blog @ HdM Stuttgart&quot; data-og-description=&quot;an article by Lucas Cr&amp;auml;mer and Jannik Smidt DISCLAIMER This post tries to keep the complexity manageable while making a point clear. We are not systems engineers/kernel developers, so feel free to point out any mistakes/misunderstandings. This post probab&quot; data-og-host=&quot;blog.mi.hdm-stuttgart.de&quot; data-og-source-url=&quot;https://blog.mi.hdm-stuttgart.de/index.php/2022/04/01/improve-your-storage-i-o-performance-today/&quot; data-og-url=&quot;https://blog.mi.hdm-stuttgart.de/index.php/2022/04/01/improve-your-storage-i-o-performance-today/&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/qto9r/hyUTKuKgqV/rc5CTkfMVtrVaywhpzkOT1/img.png?width=1129&amp;amp;height=1600&amp;amp;face=0_0_1129_1600,https://scrap.kakaocdn.net/dn/bDqUfI/hyUXJVedUQ/IFEGqHJnqwgLsc993zLIGK/img.png?width=700&amp;amp;height=404&amp;amp;face=0_0_700_404&quot;&gt;&lt;a href=&quot;https://blog.mi.hdm-stuttgart.de/index.php/2022/04/01/improve-your-storage-i-o-performance-today/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://blog.mi.hdm-stuttgart.de/index.php/2022/04/01/improve-your-storage-i-o-performance-today/&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/qto9r/hyUTKuKgqV/rc5CTkfMVtrVaywhpzkOT1/img.png?width=1129&amp;amp;height=1600&amp;amp;face=0_0_1129_1600,https://scrap.kakaocdn.net/dn/bDqUfI/hyUXJVedUQ/IFEGqHJnqwgLsc993zLIGK/img.png?width=700&amp;amp;height=404&amp;amp;face=0_0_700_404');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Improve your storage I/O performance today | Computer Science Blog @ HdM Stuttgart&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;an article by Lucas Cr&amp;auml;mer and Jannik Smidt DISCLAIMER This post tries to keep the complexity manageable while making a point clear. We are not systems engineers/kernel developers, so feel free to point out any mistakes/misunderstandings. This post probab&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;blog.mi.hdm-stuttgart.de&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://blog.csdn.net/weixin_33963189/article/details/92562519&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://blog.csdn.net/weixin_33963189/article/details/92562519&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1703856173630&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;linux io过程自顶向下分析-CSDN博客&quot; data-og-description=&quot;前言 　　IO是操作系统内核最重要的组成部分之一，它的概念很广，本文主要针对的是普通文件与设备文件的IO原理，采用自顶向下的方式，去探究从用户态的fread,fwrite函数到底层的数据是如何&quot; data-og-host=&quot;blog.csdn.net&quot; data-og-source-url=&quot;https://blog.csdn.net/weixin_33963189/article/details/92562519&quot; data-og-url=&quot;https://blog.csdn.net/weixin_33963189/article/details/92562519&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/cN6RlO/hyUTwpIuQM/8BM4S85ZnAkpPhWSTVyXWK/img.jpg?width=1846&amp;amp;height=2852&amp;amp;face=0_0_1846_2852,https://scrap.kakaocdn.net/dn/bQjA2z/hyUTBYRlrv/kePvLGKgnBJiNy4sMTcM1k/img.png?width=697&amp;amp;height=661&amp;amp;face=0_0_697_661&quot;&gt;&lt;a href=&quot;https://blog.csdn.net/weixin_33963189/article/details/92562519&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://blog.csdn.net/weixin_33963189/article/details/92562519&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/cN6RlO/hyUTwpIuQM/8BM4S85ZnAkpPhWSTVyXWK/img.jpg?width=1846&amp;amp;height=2852&amp;amp;face=0_0_1846_2852,https://scrap.kakaocdn.net/dn/bQjA2z/hyUTBYRlrv/kePvLGKgnBJiNy4sMTcM1k/img.png?width=697&amp;amp;height=661&amp;amp;face=0_0_697_661');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;linux io过程自顶向下分析-CSDN博客&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;前言 　　IO是操作系统内核最重要的组成部分之一，它的概念很广，本文主要针对的是普通文件与设备文件的IO原理，采用自顶向下的方式，去探究从用户态的fread,fwrite函数到底层的数据是如何&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;blog.csdn.net&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.kernel.org/doc/Documentation/block/cfq-iosched.txt&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://www.kernel.org/doc/Documentation/block/cfq-iosched.txt&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://dergipark.org.tr/tr/download/article-file/595488&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://dergipark.org.tr/tr/download/article-file/595488&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/ch06s04&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/ch06s04&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1704716684438&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;6.4.&amp;nbsp;Configuration Red Hat Enterprise Linux 6 | Red Hat Customer Portal&quot; data-og-description=&quot;Access Red Hat&amp;rsquo;s knowledge, guidance, and support through your subscription.&quot; data-og-host=&quot;access.redhat.com&quot; data-og-source-url=&quot;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/ch06s04&quot; data-og-url=&quot;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/ch06s04&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/eP9Z9/hyU2iKCHln/X3cs4vkpgILSsucPJdWh9k/img.png?width=200&amp;amp;height=200&amp;amp;face=0_0_200_200,https://scrap.kakaocdn.net/dn/rP2Wv/hyU2mlWKTn/r1pkLILjEJ8kXuKekYn1Fk/img.png?width=200&amp;amp;height=200&amp;amp;face=0_0_200_200&quot;&gt;&lt;a href=&quot;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/ch06s04&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/ch06s04&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/eP9Z9/hyU2iKCHln/X3cs4vkpgILSsucPJdWh9k/img.png?width=200&amp;amp;height=200&amp;amp;face=0_0_200_200,https://scrap.kakaocdn.net/dn/rP2Wv/hyU2mlWKTn/r1pkLILjEJ8kXuKekYn1Fk/img.png?width=200&amp;amp;height=200&amp;amp;face=0_0_200_200');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;6.4.&amp;nbsp;Configuration Red Hat Enterprise Linux 6 | Red Hat Customer Portal&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;Access Red Hat&amp;rsquo;s knowledge, guidance, and support through your subscription.&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;access.redhat.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/setting-the-disk-scheduler_monitoring-and-managing-system-status-and-performance&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/setting-the-disk-scheduler_monitoring-and-managing-system-status-and-performance&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1706024188572&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;Chapter&amp;nbsp;12.&amp;nbsp;Setting the disk scheduler Red Hat Enterprise Linux 8 | Red Hat Customer Portal&quot; data-og-description=&quot;Access Red Hat&amp;rsquo;s knowledge, guidance, and support through your subscription.&quot; data-og-host=&quot;access.redhat.com&quot; data-og-source-url=&quot;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/setting-the-disk-scheduler_monitoring-and-managing-system-status-and-performance&quot; data-og-url=&quot;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/setting-the-disk-scheduler_monitoring-and-managing-system-status-and-performance&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/cwtihe/hyU8YMSFfg/oJAPVGtkYdl6MjHDGEDry1/img.png?width=200&amp;amp;height=200&amp;amp;face=0_0_200_200,https://scrap.kakaocdn.net/dn/FWYYM/hyU8ZZklU1/gStMSA68aEgDO41cNcFeo1/img.png?width=200&amp;amp;height=200&amp;amp;face=0_0_200_200&quot;&gt;&lt;a href=&quot;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/setting-the-disk-scheduler_monitoring-and-managing-system-status-and-performance&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/setting-the-disk-scheduler_monitoring-and-managing-system-status-and-performance&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/cwtihe/hyU8YMSFfg/oJAPVGtkYdl6MjHDGEDry1/img.png?width=200&amp;amp;height=200&amp;amp;face=0_0_200_200,https://scrap.kakaocdn.net/dn/FWYYM/hyU8ZZklU1/gStMSA68aEgDO41cNcFeo1/img.png?width=200&amp;amp;height=200&amp;amp;face=0_0_200_200');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Chapter&amp;nbsp;12.&amp;nbsp;Setting the disk scheduler Red Hat Enterprise Linux 8 | Red Hat Customer Portal&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;Access Red Hat&amp;rsquo;s knowledge, guidance, and support through your subscription.&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;access.redhat.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://wiki.ubuntu.com/Kernel/Reference/IOSchedulers&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://wiki.ubuntu.com/Kernel/Reference/IOSchedulers&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1704766364684&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;Kernel/Reference/IOSchedulers - Ubuntu Wiki&quot; data-og-description=&quot;Linux I/O schedulers I/O schedulers attempt to improve throughput by reordering request access into a linear order based on the logical addresses of the data and trying to group these together. While this may increase overall throughput it may lead to some&quot; data-og-host=&quot;wiki.ubuntu.com&quot; data-og-source-url=&quot;https://wiki.ubuntu.com/Kernel/Reference/IOSchedulers&quot; data-og-url=&quot;https://wiki.ubuntu.com/Kernel/Reference/IOSchedulers&quot; data-og-image=&quot;&quot;&gt;&lt;a href=&quot;https://wiki.ubuntu.com/Kernel/Reference/IOSchedulers&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://wiki.ubuntu.com/Kernel/Reference/IOSchedulers&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url();&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Kernel/Reference/IOSchedulers - Ubuntu Wiki&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;Linux I/O schedulers I/O schedulers attempt to improve throughput by reordering request access into a linear order based on the logical addresses of the data and trying to group these together. While this may increase overall throughput it may lead to some&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;wiki.ubuntu.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://documentation.suse.com/ko-kr/sles/12-SP5/html/SLES-all/cha-tuning-io.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://documentation.suse.com/ko-kr/sles/12-SP5/html/SLES-all/cha-tuning-io.html&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1704778038543&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;Tuning I/O Performance | SLES 12 SP5&quot; data-og-description=&quot;I/O scheduling controls how input/output operations will be subm&amp;hellip;&quot; data-og-host=&quot;documentation.suse.com&quot; data-og-source-url=&quot;https://documentation.suse.com/ko-kr/sles/12-SP5/html/SLES-all/cha-tuning-io.html&quot; data-og-url=&quot;https://documentation.suse.com/ko-kr/sles/12-SP5/html/SLES-all/cha-tuning-io.html&quot; data-og-image=&quot;&quot;&gt;&lt;a href=&quot;https://documentation.suse.com/ko-kr/sles/12-SP5/html/SLES-all/cha-tuning-io.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://documentation.suse.com/ko-kr/sles/12-SP5/html/SLES-all/cha-tuning-io.html&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url();&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Tuning I/O Performance | SLES 12 SP5&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;I/O scheduling controls how input/output operations will be subm&amp;hellip;&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;documentation.suse.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://kernel.dk/blk-mq.pdf&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://kernel.dk/blk-mq.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.kernel.org/doc/html/latest/block/index.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://www.kernel.org/doc/html/latest/block/index.html&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1705728750907&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;Block &amp;mdash; The Linux Kernel  documentation&quot; data-og-description=&quot;&quot; data-og-host=&quot;www.kernel.org&quot; data-og-source-url=&quot;https://www.kernel.org/doc/html/latest/block/index.html&quot; data-og-url=&quot;https://www.kernel.org/doc/html/latest/block/index.html&quot; data-og-image=&quot;&quot;&gt;&lt;a href=&quot;https://www.kernel.org/doc/html/latest/block/index.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://www.kernel.org/doc/html/latest/block/index.html&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url();&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Block &amp;mdash; The Linux Kernel documentation&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;www.kernel.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-tuning-cgroups.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-tuning-cgroups.html&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1717127394070&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;Kernel Control Groups | SLES 15 SP2&quot; data-og-description=&quot;Kernel Control Groups (cgroups) are a kernel feature that allows&amp;hellip;&quot; data-og-host=&quot;documentation.suse.com&quot; data-og-source-url=&quot;https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-tuning-cgroups.html&quot; data-og-url=&quot;https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-tuning-cgroups.html&quot; data-og-image=&quot;&quot;&gt;&lt;a href=&quot;https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-tuning-cgroups.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-tuning-cgroups.html&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url();&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Kernel Control Groups | SLES 15 SP2&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;Kernel Control Groups (cgroups) are a kernel feature that allows&amp;hellip;&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;documentation.suse.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://atlarge-research.com/pdfs/2024-io-schedulers.pdf&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://atlarge-research.com/pdfs/2024-io-schedulers.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://marc.info/?l=linux-block&amp;amp;m=149180716319351&amp;amp;w=2&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://marc.info/?l=linux-block&amp;amp;m=149180716319351&amp;amp;w=2&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1717330616336&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;'[PATCH v3 5/5] blk-mq: introduce Kyber multiqueue I/O scheduler' - MARC&quot; data-og-description=&quot;&quot; data-og-host=&quot;marc.info&quot; data-og-source-url=&quot;https://marc.info/?l=linux-block&amp;amp;m=149180716319351&amp;amp;w=2&quot; data-og-url=&quot;https://marc.info/?l=linux-block&amp;amp;m=149180716319351&amp;amp;w=2&quot; data-og-image=&quot;&quot;&gt;&lt;a href=&quot;https://marc.info/?l=linux-block&amp;amp;m=149180716319351&amp;amp;w=2&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://marc.info/?l=linux-block&amp;amp;m=149180716319351&amp;amp;w=2&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url();&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;'[PATCH v3 5/5] blk-mq: introduce Kyber multiqueue I/O scheduler' - MARC&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;marc.info&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.phoronix.com/review/linux-56-nvme&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://www.phoronix.com/review/linux-56-nvme&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1718436166882&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;Linux 5.6 I/O Scheduler Benchmarks: None, Kyber, BFQ, MQ-Deadline - Phoronix&quot; data-og-description=&quot;While some Linux distributions are still using MQ-Deadline or Kyber by default for NVMe SSD storage, using no I/O scheduler still tends to perform the best overall for this speedy storage medium. In curious about the current state of the I/O schedulers wit&quot; data-og-host=&quot;www.phoronix.com&quot; data-og-source-url=&quot;https://www.phoronix.com/review/linux-56-nvme&quot; data-og-url=&quot;https://www.phoronix.com/review/linux-56-nvme&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/Be8px/hyWlhR43Ix/BFF96EIEhrS3hQ9clujHo1/img.jpg?width=695&amp;amp;height=114&amp;amp;face=0_0_695_114&quot;&gt;&lt;a href=&quot;https://www.phoronix.com/review/linux-56-nvme&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://www.phoronix.com/review/linux-56-nvme&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/Be8px/hyWlhR43Ix/BFF96EIEhrS3hQ9clujHo1/img.jpg?width=695&amp;amp;height=114&amp;amp;face=0_0_695_114');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Linux 5.6 I/O Scheduler Benchmarks: None, Kyber, BFQ, MQ-Deadline - Phoronix&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;While some Linux distributions are still using MQ-Deadline or Kyber by default for NVMe SSD storage, using no I/O scheduler still tends to perform the best overall for this speedy storage medium. In curious about the current state of the I/O schedulers wit&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;www.phoronix.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>System Engineering/Linux</category>
      <author>Hopulence</author>
      <guid isPermaLink="true">https://hopulence.tistory.com/43</guid>
      <comments>https://hopulence.tistory.com/43#entry43comment</comments>
      <pubDate>Tue, 12 Dec 2023 14:16:35 +0900</pubDate>
    </item>
    <item>
      <title>VXLAN for the Cloud, part.1 - LAN communication</title>
      <link>https://hopulence.tistory.com/41</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;These are my notes from studying legacy networks and VXLAN. My knowledge is limited, so there may be errors; I would appreciate any corrections.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Before getting to VXLAN itself, I first write down the networking basics I had forgotten while working in the field; VXLAN is covered in part.2 and part.3.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Part.1 - LAN communication&lt;/b&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;Networks and the three tables (Routing, ARP, MAC)&lt;/li&gt;
&lt;li&gt;Why use VXLAN, reason 1 - STP (Spanning Tree Protocol) and broadcast storms&lt;/li&gt;
&lt;li&gt;Why use VXLAN, reason 2 - Limits of the MAC/ARP tables&lt;/li&gt;
&lt;li&gt;Why use VXLAN, reason 3 - VLAN limits and scalability&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt; Part.2 - VXLAN&lt;/b&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;s&gt;VXLAN terminology&lt;/s&gt;&lt;/li&gt;
&lt;li&gt;&lt;s&gt;How VXLAN works (VM-to-VM unicast)&lt;/s&gt;&lt;/li&gt;
&lt;li&gt;&lt;s&gt;VXLAN header: MAC-in-UDP encapsulation&lt;/s&gt;&lt;/li&gt;
&lt;li&gt;&lt;s&gt;Manual VXLAN (no control plane)&lt;/s&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Part.3 - MPLS and EVPN&lt;/b&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;s&gt;VXLAN and MPLS&lt;/s&gt;&lt;/li&gt;
&lt;li&gt;&lt;s&gt;VXLAN and EVPN (Ethernet VPN)&lt;/s&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Networks and the three tables&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;For two systems to communicate, they need a routing table, an ARP table, and a MAC table; which of these come into play depends on the environment.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;1. Direct communication (same subnet / VLAN)&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; If the peer is in the same subnet, the sender learns the peer's MAC address through ARP resolution and sends the Ethernet frame to it directly.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;As in the figure below, when two systems are given addresses in the same subnet and connected, a connected entry is created in each routing table even if no gateway is configured.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;682&quot; data-origin-height=&quot;207&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/QkKkE/btsFFXXMG78/s9ircBjYlpoRocVNbKjoz1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/QkKkE/btsFFXXMG78/s9ircBjYlpoRocVNbKjoz1/img.png&quot; data-alt=&quot;Direct Connection&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/QkKkE/btsFFXXMG78/s9ircBjYlpoRocVNbKjoz1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FQkKkE%2FbtsFFXXMG78%2Fs9ircBjYlpoRocVNbKjoz1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;682&quot; height=&quot;207&quot; data-origin-width=&quot;682&quot; data-origin-height=&quot;207&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Direct Connection&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;* Even on a direct link, knowing the destination IP is not enough: without the destination MAC an Ethernet frame cannot be built, so the MAC has to be learned through ARP.&lt;/b&gt;&lt;/p&gt;
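&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The rule above can be sketched in Python. This is only a minimal illustration and not how any real network stack is implemented: the function names, the 192.168.10.0/24 addresses, and the MAC values are assumptions made up for this example. It shows that a frame can be built only when the destination is on-link and its MAC is already in the ARP cache.&lt;/p&gt;

```python
import ipaddress


def is_on_link(local_if, dest_ip):
    """Return True when dest_ip falls inside the subnet of local_if."""
    iface = ipaddress.ip_interface(local_if)
    return ipaddress.ip_address(dest_ip) in iface.network


def build_frame(local_if, local_mac, dest_ip, arp_cache):
    """Return a frame-like dict, or None when ARP resolution is still needed.

    All names here are hypothetical; arp_cache is a plain dict of IP to MAC.
    """
    if not is_on_link(local_if, dest_ip):
        raise ValueError("destination is off-link; a gateway is required")
    dest_mac = arp_cache.get(dest_ip)
    if dest_mac is None:
        # The IP is known but the MAC is not: the frame cannot be built yet.
        return None
    return {"dst_mac": dest_mac, "src_mac": local_mac, "dst_ip": dest_ip}


# Hypothetical addresses for two directly connected systems.
arp_cache = {}
first_try = build_frame("192.168.10.11/24", "aa:aa:aa:aa:aa:01",
                        "192.168.10.12", arp_cache)
arp_cache["192.168.10.12"] = "aa:aa:aa:aa:aa:02"  # pretend an ARP reply arrived
second_try = build_frame("192.168.10.11/24", "aa:aa:aa:aa:aa:01",
                         "192.168.10.12", arp_cache)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;In this sketch the first attempt returns None because the cache is empty, and the second attempt succeeds once the MAC has been stored.&lt;/p&gt;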
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;An L2 network whose ports share the same VLAN works the same way: with no gateway configured, each system has a connected entry in its routing table and sends its ARP request directly to the peer.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;525&quot; data-origin-height=&quot;304&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/c1OyRz/btsFJ9hr6yl/L4z5VGwKW2B5rIRRMBqv70/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/c1OyRz/btsFJ9hr6yl/L4z5VGwKW2B5rIRRMBqv70/img.png&quot; data-alt=&quot;L2 communication&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/c1OyRz/btsFJ9hr6yl/L4z5VGwKW2B5rIRRMBqv70/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fc1OyRz%2FbtsFJ9hr6yl%2FL4z5VGwKW2B5rIRRMBqv70%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;525&quot; height=&quot;304&quot; data-origin-width=&quot;525&quot; data-origin-height=&quot;304&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;L2 communication&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #333333; text-align: start;&quot;&gt;&amp;nbsp;As its basic behavior, a switch performs&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;b&gt;&lt;u&gt;Transparent Bridging&lt;/u&gt;&lt;/b&gt;, which consists of five operations (Learning, Flooding, Forwarding, Aging, and Filtering); through it the switch learns MAC addresses and keeps them in an internal DB. In other words, each switch port that Worker 1 or Worker 2 is connected to learns the MAC and VLAN of the directly attached interface.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;940&quot; data-origin-height=&quot;204&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/F3Qpx/btsFJsIz8w3/qX69COlkfUsEMHknF7IUO0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/F3Qpx/btsFJsIz8w3/qX69COlkfUsEMHknF7IUO0/img.png&quot; data-alt=&quot;Switch MAC table&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/F3Qpx/btsFJsIz8w3/qX69COlkfUsEMHknF7IUO0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FF3Qpx%2FbtsFJsIz8w3%2FqX69COlkfUsEMHknF7IUO0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;940&quot; height=&quot;204&quot; data-origin-width=&quot;940&quot; data-origin-height=&quot;204&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Switch MAC table&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
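&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The five operations can be sketched as a toy model. This is only an illustration, not how any real switch is implemented: the Switch class, the port numbers, the shortened MAC strings, and the 300-second aging time are all assumptions made for the example.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    # Toy transparent bridge (hypothetical): each port belongs to one VLAN.
    port_vlan: dict                              # port number -> VLAN ID
    aging_seconds: float = 300.0                 # assumed aging time
    table: dict = field(default_factory=dict)    # (vlan, mac) -> (port, last_seen)

    def receive(self, in_port, src_mac, dst_mac, now):
        vlan = self.port_vlan[in_port]
        # Aging: forget entries that have not been refreshed recently.
        self.table = {key: (port, seen)
                      for key, (port, seen) in self.table.items()
                      if self.aging_seconds >= now - seen}
        # Learning: remember which port the source MAC lives behind.
        self.table[(vlan, src_mac)] = (in_port, now)
        entry = self.table.get((vlan, dst_mac))
        if entry is None:
            # Flooding: unknown destination goes to every other port in the VLAN.
            return sorted(p for p, v in self.port_vlan.items()
                          if v == vlan and p != in_port)
        out_port = entry[0]
        if out_port == in_port:
            # Filtering: the destination is behind the arrival port, so drop.
            return []
        # Forwarding: a known destination goes out of exactly one port.
        return [out_port]


sw = Switch(port_vlan={1: 10, 2: 10, 3: 10, 4: 20})
first = sw.receive(1, 'aa:aa', 'bb:bb', now=0.0)    # bb:bb unknown, flood VLAN 10
reply = sw.receive(2, 'bb:bb', 'aa:aa', now=1.0)    # aa:aa was learned on port 1
known = sw.receive(1, 'aa:aa', 'bb:bb', now=2.0)    # bb:bb was learned on port 2
aged = sw.receive(1, 'aa:aa', 'bb:bb', now=400.0)   # entries expired, flood again
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Port 4 never receives any of these frames because it sits in VLAN 20, which is the filtering-by-VLAN behavior described above.&lt;/p&gt;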
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;2. Indirect communication (different subnet / VLAN)&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;: When the communication target is on a different subnet, an L3 switch (gateway) is required.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;651&quot; data-origin-height=&quot;368&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/pVg4i/btsFFZg1ohU/VoKVDJIU4qPbnGpkKsuK5k/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/pVg4i/btsFFZg1ohU/VoKVDJIU4qPbnGpkKsuK5k/img.png&quot; data-alt=&quot;L3 communication&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/pVg4i/btsFFZg1ohU/VoKVDJIU4qPbnGpkKsuK5k/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FpVg4i%2FbtsFFZg1ohU%2FVoKVDJIU4qPbnGpkKsuK5k%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;651&quot; height=&quot;368&quot; data-origin-width=&quot;651&quot; data-origin-height=&quot;368&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;L3 communication&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
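&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;A host decides between talking to a peer directly and going through the gateway by checking whether the destination falls inside its own subnet. The sketch below uses Python's standard ipaddress module; the next_hop helper and all addresses are hypothetical values chosen for illustration.&lt;/p&gt;

```python
import ipaddress


def next_hop(src_cidr, dst_ip, gateway_ip):
    # src_cidr is the sender's address with its prefix length, e.g. '10.0.1.11/24'.
    src = ipaddress.ip_interface(src_cidr)
    dst = ipaddress.ip_address(dst_ip)
    if dst in src.network:
        # Same subnet: resolve the destination's own MAC (L2 only).
        return ('direct', str(dst))
    # Different subnet: hand the packet to the gateway, which routes it (L3).
    return ('gateway', gateway_ip)


same = next_hop('10.0.1.11/24', '10.0.1.12', '10.0.1.1')
other = next_hop('10.0.1.11/24', '10.0.2.12', '10.0.1.1')
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Here same resolves to the peer itself, while other resolves to the gateway address, matching the two cases in the figures.&lt;/p&gt;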
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
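&lt;p data-ke-size=&quot;size16&quot;&gt;*&lt;b&gt; VLAN + broadcast domain and ARP&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;In the transparent bridging described earlier, when a frame arrives whose destination MAC is not in the switch's own DB, the switch filters down to the ports that belong to the same VLAN and floods the frame out of them. This process is commonly called an &lt;b&gt;L2 broadcast&lt;/b&gt; (strictly speaking it is unknown-unicast flooding, which reaches the same set of ports). The same subnet / VLAN can be regarded as one broadcast domain, and the switch floods in this way both when the MAC is unknown, as above, and when an ARP request is sent.&lt;/p&gt;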
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1519&quot; data-origin-height=&quot;667&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/syYFK/btsFJG7C4fy/QpugFtCnMwVwpc5zYFhsBk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/syYFK/btsFJG7C4fy/QpugFtCnMwVwpc5zYFhsBk/img.png&quot; data-alt=&quot;ARP Resolution&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/syYFK/btsFJG7C4fy/QpugFtCnMwVwpc5zYFhsBk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FsyYFK%2FbtsFJG7C4fy%2FQpugFtCnMwVwpc5zYFhsBk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;974&quot; height=&quot;428&quot; data-origin-width=&quot;1519&quot; data-origin-height=&quot;667&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;ARP Resolution&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
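&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;To make the ARP request concrete, the sketch below builds the raw bytes of one for IPv4 over Ethernet. The build_arp_request helper and the MAC and IP values are hypothetical; the field layout follows the standard ARP format (hardware type 1, protocol type 0x0800, opcode 1 for a request), and the frame is only constructed, not sent.&lt;/p&gt;

```python
import socket
import struct


def build_arp_request(sender_mac, sender_ip, target_ip):
    # Ethernet header: broadcast destination, sender source, EtherType 0x0806 (ARP).
    broadcast_mac = bytes.fromhex('ffffffffffff')
    src_mac = bytes.fromhex(sender_mac.replace(':', ''))
    ethernet = broadcast_mac + src_mac + struct.pack('!H', 0x0806)
    # ARP payload: Ethernet (1), IPv4 (0x0800), lengths 6 and 4, opcode 1 (request).
    arp = struct.pack(
        '!HHBBH6s4s6s4s',
        1, 0x0800, 6, 4, 1,
        src_mac, socket.inet_aton(sender_ip),
        bytes(6),                      # target MAC is unknown, so all zeros
        socket.inet_aton(target_ip),
    )
    return ethernet + arp


frame = build_arp_request('02:00:00:00:00:01', '10.0.1.11', '10.0.1.12')
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Because the destination MAC is ff:ff:ff:ff:ff:ff, the switch delivers this frame to every port in the sender's VLAN, and only the owner of the target IP answers.&lt;/p&gt;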
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;As the figure below shows, the main purpose of VLAN is to &lt;u&gt;separate broadcast domains by using different VLAN tags&lt;/u&gt;. Without VLANs, a broadcast packet in a single network propagates to every endpoint, whereas separating the network with VLANs saves bandwidth and network resources.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;856&quot; data-origin-height=&quot;955&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/vzJka/btsFF8dxCSb/tkb1x3jVZNfS4zOkL4ixZ0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/vzJka/btsFF8dxCSb/tkb1x3jVZNfS4zOkL4ixZ0/img.png&quot; data-alt=&quot;VLAN and broadcast domain&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/vzJka/btsFF8dxCSb/tkb1x3jVZNfS4zOkL4ixZ0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FvzJka%2FbtsFF8dxCSb%2Ftkb1x3jVZNfS4zOkL4ixZ0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;663&quot; height=&quot;740&quot; data-origin-width=&quot;856&quot; data-origin-height=&quot;955&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;VLAN and broadcast domain&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
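&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The saving can be seen by counting who receives a single broadcast. In this small sketch the endpoint names and VLAN IDs are made up for illustration, and the flat network is modeled as every endpoint sharing one VLAN.&lt;/p&gt;

```python
# Hypothetical endpoints and the VLAN each access port is assigned to.
endpoint_vlan = {
    'pc-a': 10, 'pc-b': 10, 'pc-c': 10,
    'srv-a': 20, 'srv-b': 20,
    'cam-a': 30,
}


def broadcast_receivers(sender, vlan_of):
    # A broadcast reaches every other endpoint in the sender's VLAN only.
    domain = vlan_of[sender]
    return sorted(name for name, vlan in vlan_of.items()
                  if vlan == domain and name != sender)


# Without VLANs every endpoint shares a single broadcast domain.
flat = {name: 1 for name in endpoint_vlan}

flat_receivers = broadcast_receivers('pc-a', flat)
vlan_receivers = broadcast_receivers('pc-a', endpoint_vlan)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;In the flat case all five other endpoints process the broadcast; with VLANs only the two other members of VLAN 10 do.&lt;/p&gt;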
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Why use VXLAN, reason 1: STP (Spanning Tree Protocol) and broadcast storms&lt;/li&gt;
&lt;/ul&gt;
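&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Although it is generally not used in today's cloud networks, configuring STP was essential in legacy environments to prevent packet looping. When L2 switches form a triangle, as in the topology below, and a frame arrives destined for a MAC the switch does not know, the switch floods the frame to find that MAC. If no switch in the internal network knows the MAC, however, the same frame is propagated back to the switch that first flooded it, and a loop forms. CPU utilization can then reach 100% and the system stalls, which is known as a &lt;u&gt;&lt;b&gt;broadcast storm&lt;/b&gt;&lt;/u&gt;.&lt;/p&gt;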
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1046&quot; data-origin-height=&quot;340&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b7nD8v/btsFF8q7XDn/wT0oKIpawqakWg0BUXZps0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b7nD8v/btsFF8q7XDn/wT0oKIpawqakWg0BUXZps0/img.png&quot; data-alt=&quot;STP and broadcast storm&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b7nD8v/btsFF8q7XDn/wT0oKIpawqakWg0BUXZps0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb7nD8v%2FbtsFF8q7XDn%2FwT0oKIpawqakWg0BUXZps0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1046&quot; height=&quot;340&quot; data-origin-width=&quot;1046&quot; data-origin-height=&quot;340&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;STP and broadcast storm&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
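&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;To prevent such loops, STP elects the switch with the lowest bridge ID (priority first, then MAC address) as the root bridge, keeps each switch's least-cost path toward the root, and blocks the remaining redundant port, so that one link of the triangle stops forwarding frames.&amp;nbsp;&lt;/p&gt;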
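&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The effect of blocking one link can be shown with a toy flood simulation. This is a simplified sketch, not an STP implementation: the switch names are hypothetical, no switch knows the destination MAC, and the sw2-sw3 link is blocked by hand on the assumption that sw1 is the root bridge.&lt;/p&gt;

```python
# Hypothetical triangle of three L2 switches; every pair is linked.
LINKS = {('sw1', 'sw2'), ('sw2', 'sw3'), ('sw1', 'sw3')}


def neighbors(switch, links):
    result = []
    for a, b in sorted(links):
        if a == switch:
            result.append(b)
        elif b == switch:
            result.append(a)
    return result


def flood(links, start, rounds):
    # Each in-flight copy is (current switch, switch it arrived from).
    in_flight = [(start, None)]
    history = []
    for _ in range(rounds):
        next_round = []
        for switch, came_from in in_flight:
            # Flood out of every link except the one the copy arrived on.
            for peer in neighbors(switch, links):
                if peer != came_from:
                    next_round.append((peer, switch))
        in_flight = next_round
        history.append(len(in_flight))
    return history


# Copies still in flight after each round, with and without the loop.
storm = flood(LINKS, 'sw1', rounds=6)
loop_free = flood(LINKS - {('sw2', 'sw3')}, 'sw1', rounds=6)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;With the full triangle the copies never die out, because each one keeps circling the loop. Once the sw2-sw3 link is blocked, the flood stops after the first round.&lt;/p&gt;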
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;(This post is still being written.)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;852&quot; data-origin-height=&quot;363&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/sPxP5/btsFGKiTWpj/FIvCQ5a4acWqYckmxkMuVk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/sPxP5/btsFGKiTWpj/FIvCQ5a4acWqYckmxkMuVk/img.png&quot; data-alt=&quot;Usage by VLAN ID&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/sPxP5/btsFGKiTWpj/FIvCQ5a4acWqYckmxkMuVk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FsPxP5%2FbtsFGKiTWpj%2FFIvCQ5a4acWqYckmxkMuVk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;619&quot; height=&quot;264&quot; data-origin-width=&quot;852&quot; data-origin-height=&quot;363&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Usage by VLAN ID&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;800&quot; data-origin-height=&quot;770&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/VfSV7/btsFHmvqf67/vwwWi32vtUPw7KCksxtmN1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/VfSV7/btsFHmvqf67/vwwWi32vtUPw7KCksxtmN1/img.png&quot; data-alt=&quot;VLAN constraints in VM migration&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/VfSV7/btsFHmvqf67/vwwWi32vtUPw7KCksxtmN1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FVfSV7%2FbtsFHmvqf67%2FvwwWi32vtUPw7KCksxtmN1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;626&quot; height=&quot;603&quot; data-origin-width=&quot;800&quot; data-origin-height=&quot;770&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;VLAN constraints in VM migration&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Reference&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://info.support.huawei.com/info-finder/encyclopedia/en/VXLAN.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://info.support.huawei.com/info-finder/encyclopedia/en/VXLAN.html&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1702360552243&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;What Is VXLAN? How Does It Differ from VLAN? - Huawei&quot; data-og-description=&quot;VXLAN is a tunneling technology used on large Layer 2 networks, and transmits packets over a VXLAN tunnel between source and destination devices.&quot; data-og-host=&quot;info.support.huawei.com&quot; data-og-source-url=&quot;https://info.support.huawei.com/info-finder/encyclopedia/en/VXLAN.html&quot; data-og-url=&quot;https://info.support.huawei.com/info-finder/encyclopedia/en/VXLAN.html&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/QR2sj/hyUIEu2Zss/06KueeQkzB0oczv4Vko1Q0/img.png?width=787&amp;amp;height=443&amp;amp;face=0_0_787_443,https://scrap.kakaocdn.net/dn/pZxTI/hyULRGk2v0/asb2iaQj9IkOcOzMr9nsm1/img.png?width=443&amp;amp;height=427&amp;amp;face=0_0_443_427,https://scrap.kakaocdn.net/dn/c8vXi8/hyUL0Df3dC/TWTQT7hEyGyg8zY5iU3KG1/img.png?width=364&amp;amp;height=349&amp;amp;face=0_0_364_349&quot;&gt;&lt;a href=&quot;https://info.support.huawei.com/info-finder/encyclopedia/en/VXLAN.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://info.support.huawei.com/info-finder/encyclopedia/en/VXLAN.html&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/QR2sj/hyUIEu2Zss/06KueeQkzB0oczv4Vko1Q0/img.png?width=787&amp;amp;height=443&amp;amp;face=0_0_787_443,https://scrap.kakaocdn.net/dn/pZxTI/hyULRGk2v0/asb2iaQj9IkOcOzMr9nsm1/img.png?width=443&amp;amp;height=427&amp;amp;face=0_0_443_427,https://scrap.kakaocdn.net/dn/c8vXi8/hyUL0Df3dC/TWTQT7hEyGyg8zY5iU3KG1/img.png?width=364&amp;amp;height=349&amp;amp;face=0_0_364_349');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;What Is VXLAN? How Does It Differ from VLAN? - Huawei&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;VXLAN is a tunneling technology used on large Layer 2 networks, and transmits packets over a VXLAN tunnel between source and destination devices.&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;info.support.huawei.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc7348&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://datatracker.ietf.org/doc/html/rfc7348&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1702360713061&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;RFC 7348: Virtual eXtensible Local Area Network (VXLAN): A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Ne&quot; data-og-description=&quot;This document describes Virtual eXtensible Local Area Network (VXLAN), which is used to address the need for overlay networks within virtualized data centers accommodating multiple tenants. The scheme and the related protocols can be used in networks for c&quot; data-og-host=&quot;datatracker.ietf.org&quot; data-og-source-url=&quot;https://datatracker.ietf.org/doc/html/rfc7348&quot; data-og-url=&quot;https://datatracker.ietf.org/doc/html/rfc7348&quot; data-og-image=&quot;&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc7348&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://datatracker.ietf.org/doc/html/rfc7348&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url();&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;RFC 7348: Virtual eXtensible Local Area Network (VXLAN): A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Ne&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;This document describes Virtual eXtensible Local Area Network (VXLAN), which is used to address the need for overlay networks within virtualized data centers accommodating multiple tenants. The scheme and the related protocols can be used in networks for c&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;datatracker.ietf.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://white-polarbear.tistory.com/100&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://white-polarbear.tistory.com/100&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1703061366638&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;Hierarchical LAN design Model (with. 네트워크 디자인)&quot; data-og-description=&quot;● Hierarchical LAN design Model 소개 Hierarchical LAN design Model (계층적 LAN 디자인)은 엔터프라이즈 네트워크 아키텍처를 모듈형으로 나누어 각각의 모듈에서 기능을 수행하는 구조를 의미 합니다. 계층&quot; data-og-host=&quot;white-polarbear.tistory.com&quot; data-og-source-url=&quot;https://white-polarbear.tistory.com/100&quot; data-og-url=&quot;https://white-polarbear.tistory.com/100&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/ftzDt/hyUPz60pW4/wZtz4Ff4tEmv9hERe1CsxK/img.png?width=553&amp;amp;height=505&amp;amp;face=0_0_553_505,https://scrap.kakaocdn.net/dn/CESIy/hyUPDVSfS9/RGnC3O5m90a20TnGjiGID0/img.png?width=553&amp;amp;height=505&amp;amp;face=0_0_553_505,https://scrap.kakaocdn.net/dn/qUKbe/hyUPBcG53V/N5nEtSC3itWDs7wayzMSzk/img.png?width=748&amp;amp;height=533&amp;amp;face=0_0_748_533&quot;&gt;&lt;a href=&quot;https://white-polarbear.tistory.com/100&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://white-polarbear.tistory.com/100&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/ftzDt/hyUPz60pW4/wZtz4Ff4tEmv9hERe1CsxK/img.png?width=553&amp;amp;height=505&amp;amp;face=0_0_553_505,https://scrap.kakaocdn.net/dn/CESIy/hyUPDVSfS9/RGnC3O5m90a20TnGjiGID0/img.png?width=553&amp;amp;height=505&amp;amp;face=0_0_553_505,https://scrap.kakaocdn.net/dn/qUKbe/hyUPBcG53V/N5nEtSC3itWDs7wayzMSzk/img.png?width=748&amp;amp;height=533&amp;amp;face=0_0_748_533');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Hierarchical LAN design Model (with. 네트워크 디자인)&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;● Hierarchical LAN design Model 소개 Hierarchical LAN design Model (계층적 LAN 디자인)은 엔터프라이즈 네트워크 아키텍처를 모듈형으로 나누어 각각의 모듈에서 기능을 수행하는 구조를 의미 합니다. 계층&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;white-polarbear.tistory.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=Do6G9w_DjJ4&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://www.youtube.com/watch?v=Do6G9w_DjJ4&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://networkdirection.net/articles/routingandswitching/vxlanoverview/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://networkdirection.net/articles/routingandswitching/vxlanoverview/&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1706183930626&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;VXLAN Overview - Network Direction&quot; data-og-description=&quot;VxLAN provides a way to get a layer-2 network to run over the top of a layer-3 network. This improves scalability and network designs&quot; data-og-host=&quot;networkdirection.net&quot; data-og-source-url=&quot;https://networkdirection.net/articles/routingandswitching/vxlanoverview/&quot; data-og-url=&quot;https://networkdirection.net/articles/routingandswitching/vxlanoverview/&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/sCyoQ/hyVb97m1iR/qy4kYWkJ9YCvxoPryYgMKK/img.jpg?width=1024&amp;amp;height=295&amp;amp;face=0_0_1024_295,https://scrap.kakaocdn.net/dn/bvGQcg/hyVcdu90nq/KkKQ5LpSBWkzUopDnSaF5K/img.jpg?width=1024&amp;amp;height=295&amp;amp;face=0_0_1024_295,https://scrap.kakaocdn.net/dn/dloFta/hyU8WBQVcq/ifLBKh2UTgWwKJW23VVr0K/img.png?width=465&amp;amp;height=397&amp;amp;face=0_0_465_397&quot;&gt;&lt;a href=&quot;https://networkdirection.net/articles/routingandswitching/vxlanoverview/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://networkdirection.net/articles/routingandswitching/vxlanoverview/&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/sCyoQ/hyVb97m1iR/qy4kYWkJ9YCvxoPryYgMKK/img.jpg?width=1024&amp;amp;height=295&amp;amp;face=0_0_1024_295,https://scrap.kakaocdn.net/dn/bvGQcg/hyVcdu90nq/KkKQ5LpSBWkzUopDnSaF5K/img.jpg?width=1024&amp;amp;height=295&amp;amp;face=0_0_1024_295,https://scrap.kakaocdn.net/dn/dloFta/hyU8WBQVcq/ifLBKh2UTgWwKJW23VVr0K/img.png?width=465&amp;amp;height=397&amp;amp;face=0_0_465_397');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;VXLAN Overview - Network Direction&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;VxLAN provides a way to get a layer-2 network to run over the top of a layer-3 network. This improves scalability and network designs&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;networkdirection.net&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://support.huawei.com/enterprise/en/doc/EDOC1100023542?section=j017&amp;amp;topicName=layer-2-mac-address-learning-and-bum-packet-forwarding&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://support.huawei.com/enterprise/en/doc/EDOC1100023542?section=j017&amp;amp;topicName=layer-2-mac-address-learning-and-bum-packet-forwarding&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1706183932265&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;https://support.huawei.com/enterprise/en/doc/EDOC1100023542?section=j017&amp;amp;topicName=layer-2-mac-address-learning-and-bum-packet-forwarding&quot; data-og-description=&quot;&quot; data-og-host=&quot;support.huawei.com&quot; data-og-source-url=&quot;https://support.huawei.com/enterprise/en/doc/EDOC1100023542?section=j017&amp;amp;topicName=layer-2-mac-address-learning-and-bum-packet-forwarding&quot; data-og-url=&quot;https://support.huawei.com/enterprise/en/doc/EDOC1100023542?section=j017&amp;amp;topicName=layer-2-mac-address-learning-and-bum-packet-forwarding&quot; data-og-image=&quot;&quot;&gt;&lt;a href=&quot;https://support.huawei.com/enterprise/en/doc/EDOC1100023542?section=j017&amp;amp;topicName=layer-2-mac-address-learning-and-bum-packet-forwarding&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://support.huawei.com/enterprise/en/doc/EDOC1100023542?section=j017&amp;amp;topicName=layer-2-mac-address-learning-and-bum-packet-forwarding&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url();&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;https://support.huawei.com/enterprise/en/doc/EDOC1100023542?section=j017&amp;amp;topicName=layer-2-mac-address-learning-and-bum-packet-forwarding&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;support.huawei.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.cisco.com/c/dam/global/ko_kr/partners/assets/partner-webinar-ndfc-ndi-update.pdf&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://www.cisco.com/c/dam/global/ko_kr/partners/assets/partner-webinar-ndfc-ndi-update.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc3031&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://datatracker.ietf.org/doc/html/rfc3031&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1706611597594&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;RFC 3031: Multiprotocol Label Switching Architecture&quot; data-og-description=&quot;This document specifies the architecture for Multiprotocol Label Switching (MPLS). [STANDARDS-TRACK]&quot; data-og-host=&quot;datatracker.ietf.org&quot; data-og-source-url=&quot;https://datatracker.ietf.org/doc/html/rfc3031&quot; data-og-url=&quot;https://datatracker.ietf.org/doc/html/rfc3031&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/FiQAM/hyVb9U274U/V8nxzsN2BB5cN6KCQ6Ppb1/img.png?width=1200&amp;amp;height=630&amp;amp;face=0_0_1200_630&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc3031&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://datatracker.ietf.org/doc/html/rfc3031&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/FiQAM/hyVb9U274U/V8nxzsN2BB5cN6KCQ6Ppb1/img.png?width=1200&amp;amp;height=630&amp;amp;face=0_0_1200_630');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;RFC 3031: Multiprotocol Label Switching Architecture&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;This document specifies the architecture for Multiprotocol Label Switching (MPLS). [STANDARDS-TRACK]&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;datatracker.ietf.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/rfc4221/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://datatracker.ietf.org/doc/rfc4221/&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1706611601078&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;RFC 4221: Multiprotocol Label Switching (MPLS) Management Overview&quot; data-og-description=&quot;A range of Management Information Base (MIB) modules has been developed to help model and manage the various aspects of Multiprotocol Label Switching (MPLS) networks. These MIB modules are defined in separate documents that focus on the specific areas of r&quot; data-og-host=&quot;datatracker.ietf.org&quot; data-og-source-url=&quot;https://datatracker.ietf.org/doc/rfc4221/&quot; data-og-url=&quot;https://datatracker.ietf.org/doc/rfc4221/&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/cKfsQL/hyVccqHnwJ/loJAziAE2d3q95CYBXdzVk/img.png?width=1200&amp;amp;height=630&amp;amp;face=0_0_1200_630&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/rfc4221/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://datatracker.ietf.org/doc/rfc4221/&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/cKfsQL/hyVccqHnwJ/loJAziAE2d3q95CYBXdzVk/img.png?width=1200&amp;amp;height=630&amp;amp;face=0_0_1200_630');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;RFC 4221: Multiprotocol Label Switching (MPLS) Management Overview&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;A range of Management Information Base (MIB) modules has been developed to help model and manage the various aspects of Multiprotocol Label Switching (MPLS) networks. These MIB modules are defined in separate documents that focus on the specific areas of r&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;datatracker.ietf.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc4760&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://datatracker.ietf.org/doc/html/rfc4760&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1706611607640&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;RFC 4760: Multiprotocol Extensions for BGP-4&quot; data-og-description=&quot;This document defines extensions to BGP-4 to enable it to carry routing information for multiple Network Layer protocols (e.g., IPv6, IPX, L3VPN, etc.). The extensions are backward compatible - a router that supports the extensions can interoperate with a &quot; data-og-host=&quot;datatracker.ietf.org&quot; data-og-source-url=&quot;https://datatracker.ietf.org/doc/html/rfc4760&quot; data-og-url=&quot;https://datatracker.ietf.org/doc/html/rfc4760&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/cao7TZ/hyVb9tXSIT/Ceh4Im9dN8eM3x5ag1CxI1/img.png?width=1200&amp;amp;height=630&amp;amp;face=0_0_1200_630&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc4760&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://datatracker.ietf.org/doc/html/rfc4760&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/cao7TZ/hyVb9tXSIT/Ceh4Im9dN8eM3x5ag1CxI1/img.png?width=1200&amp;amp;height=630&amp;amp;face=0_0_1200_630');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;RFC 4760: Multiprotocol Extensions for BGP-4&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;This document defines extensions to BGP-4 to enable it to carry routing information for multiple Network Layer protocols (e.g., IPv6, IPX, L3VPN, etc.). The extensions are backward compatible - a router that supports the extensions can interoperate with a&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;datatracker.ietf.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://support.huawei.com/enterprise/en/doc/EDOC1100171957&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://support.huawei.com/enterprise/en/doc/EDOC1100171957&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1706612859003&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;https://support.huawei.com/enterprise/en/doc/EDOC1100171957&quot; data-og-description=&quot;&quot; data-og-host=&quot;support.huawei.com&quot; data-og-source-url=&quot;https://support.huawei.com/enterprise/en/doc/EDOC1100171957&quot; data-og-url=&quot;https://support.huawei.com/enterprise/en/doc/EDOC1100171957&quot; data-og-image=&quot;&quot;&gt;&lt;a href=&quot;https://support.huawei.com/enterprise/en/doc/EDOC1100171957&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://support.huawei.com/enterprise/en/doc/EDOC1100171957&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url();&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;https://support.huawei.com/enterprise/en/doc/EDOC1100171957&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;support.huawei.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc4684&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://datatracker.ietf.org/doc/html/rfc4684&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1706670410906&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;RFC 4684: Constrained Route Distribution for Border Gateway Protocol/MultiProtocol Label Switching (BGP/MPLS) Internet Protocol &quot; data-og-description=&quot;This document defines Multi-Protocol BGP (MP-BGP) procedures that allow BGP speakers to exchange Route Target reachability information. This information can be used to build a route distribution graph in order to limit the propagation of Virtual Private Ne&quot; data-og-host=&quot;datatracker.ietf.org&quot; data-og-source-url=&quot;https://datatracker.ietf.org/doc/html/rfc4684&quot; data-og-url=&quot;https://datatracker.ietf.org/doc/html/rfc4684&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/YYjlK/hyVf5p4B7v/5SkKtTjCqkV5hI3951PrKk/img.png?width=1200&amp;amp;height=630&amp;amp;face=0_0_1200_630&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc4684&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://datatracker.ietf.org/doc/html/rfc4684&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/YYjlK/hyVf5p4B7v/5SkKtTjCqkV5hI3951PrKk/img.png?width=1200&amp;amp;height=630&amp;amp;face=0_0_1200_630');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;RFC 4684: Constrained Route Distribution for Border Gateway Protocol/MultiProtocol Label Switching (BGP/MPLS) Internet Protocol&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;This document defines Multi-Protocol BGP (MP-BGP) procedures that allow BGP speakers to exchange Route Target reachability information. This information can be used to build a route distribution graph in order to limit the propagation of Virtual Private Ne&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;datatracker.ietf.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://blog.naver.com/goduck2/220111709554&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://blog.naver.com/goduck2/220111709554&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1710166436708&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;[오리뎅이의 LAN 통신 이야기 - 1] 삼테이블을 정복하면 LAN 통신은 끝!!&quot; data-og-description=&quot;안녕하세요? 오리뎅이입니다. &amp;nbsp; 우리나라 말에 &amp;quot;시작이 반&amp;quot;이라는 말이 있습니다. 뭔가를 시작하기가 ...&quot; data-og-host=&quot;blog.naver.com&quot; data-og-source-url=&quot;https://blog.naver.com/goduck2/220111709554&quot; data-og-url=&quot;https://blog.naver.com/goduck2/220111709554&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/baiKPZ/hyVxtZTwmR/yzyTZKapIDggnNH19NLqi0/img.png?width=613&amp;amp;height=369&amp;amp;face=0_0_613_369&quot;&gt;&lt;a href=&quot;https://blog.naver.com/goduck2/220111709554&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://blog.naver.com/goduck2/220111709554&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/baiKPZ/hyVxtZTwmR/yzyTZKapIDggnNH19NLqi0/img.png?width=613&amp;amp;height=369&amp;amp;face=0_0_613_369');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;[오리뎅이의 LAN 통신 이야기 - 1] 삼테이블을 정복하면 LAN 통신은 끝!!&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;안녕하세요? 오리뎅이입니다. &amp;nbsp; 우리나라 말에 &quot;시작이 반&quot;이라는 말이 있습니다. 뭔가를 시작하기가 ...&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;blog.naver.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.simulationexams.com/Blog/2018/04/17/spanning-tree-protocol-stp-in-local-area-networks-lans/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://www.simulationexams.com/Blog/2018/04/17/spanning-tree-protocol-stp-in-local-area-networks-lans/&lt;/a&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1710167481643&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;Spanning Tree Protocol (STP) in Local Area Networks (LANs)&quot; data-og-description=&quot;What is Spanning Tree Protocol: In computer networking, data packets are forwarded from one network node to another as the packet travels from source to destina&quot; data-og-host=&quot;www.simulationexams.com&quot; data-og-source-url=&quot;https://www.simulationexams.com/Blog/2018/04/17/spanning-tree-protocol-stp-in-local-area-networks-lans/&quot; data-og-url=&quot;https://www.simulationexams.com/Blog/2018/04/17/spanning-tree-protocol-stp-in-local-area-networks-lans/&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/ctAIfT/hyVxu5AYgb/7aPtJiXKdawwCNGJs8xmT0/img.gif?width=300&amp;amp;height=281&amp;amp;face=0_0_300_281,https://scrap.kakaocdn.net/dn/jYIqs/hyVxArb4Ye/KQsk7sDB9op1Nvj6qJ1850/img.gif?width=300&amp;amp;height=281&amp;amp;face=0_0_300_281,https://scrap.kakaocdn.net/dn/4cA03/hyVxBcyySz/SvzhKaG1Akw3WKUxf8Jir1/img.jpg?width=750&amp;amp;height=459&amp;amp;face=0_0_750_459&quot;&gt;&lt;a href=&quot;https://www.simulationexams.com/Blog/2018/04/17/spanning-tree-protocol-stp-in-local-area-networks-lans/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://www.simulationexams.com/Blog/2018/04/17/spanning-tree-protocol-stp-in-local-area-networks-lans/&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/ctAIfT/hyVxu5AYgb/7aPtJiXKdawwCNGJs8xmT0/img.gif?width=300&amp;amp;height=281&amp;amp;face=0_0_300_281,https://scrap.kakaocdn.net/dn/jYIqs/hyVxArb4Ye/KQsk7sDB9op1Nvj6qJ1850/img.gif?width=300&amp;amp;height=281&amp;amp;face=0_0_300_281,https://scrap.kakaocdn.net/dn/4cA03/hyVxBcyySz/SvzhKaG1Akw3WKUxf8Jir1/img.jpg?width=750&amp;amp;height=459&amp;amp;face=0_0_750_459');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Spanning Tree Protocol (STP) in Local Area Networks (LANs)&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;What is Spanning Tree Protocol: In computer networking, data packets are forwarded from one network node to another as the packet travels from source to destina&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;www.simulationexams.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>System Engineering/Network</category>
      <category>VXLAN</category>
      <author>Hopulence</author>
      <guid isPermaLink="true">https://hopulence.tistory.com/41</guid>
      <comments>https://hopulence.tistory.com/41#entry41comment</comments>
      <pubDate>Tue, 10 Oct 2023 00:06:31 +0900</pubDate>
    </item>
    <item>
      <title>[Kernel Story] TCP Keepalive and Retransmission</title>
      <link>https://hopulence.tistory.com/40</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;This post continues from the &lt;a href=&quot;https://hopulence.tistory.com/39&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;previous one (TCP handshake and TIME_WAIT sockets)&lt;/a&gt;.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Contents&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- TCP keepalive&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Keepalive and zombie connections&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- TCP keepalive vs HTTP keepalive&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Keepalive and load balancers: a problem keepalive can solve&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- TCP retransmission and RTO&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Retransmission, kernel parameters, and the tcp_write_timeout() function&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Changing RTO_MIN&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- TCP retransmission and application timeouts&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;1. TCP keepalive&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Keepalive is a feature that keeps an established session open after its requests have finished by sending very small packets on a timer. When a client makes frequent requests and would otherwise open and close sessions repeatedly, keeping the connection open and continuing to serve requests over it &lt;u&gt;reduces unnecessary handshakes and TIME_WAIT sockets and improves server performance&lt;/u&gt;.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;600&quot; data-origin-height=&quot;519&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/8MAuF/btsvHeNSfpm/ney3VaRKnVPcf14MRbhk9k/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/8MAuF/btsvHeNSfpm/ney3VaRKnVPcf14MRbhk9k/img.png&quot; data-alt=&quot;keepalive packet flow&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/8MAuF/btsvHeNSfpm/ney3VaRKnVPcf14MRbhk9k/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F8MAuF%2FbtsvHeNSfpm%2Fney3VaRKnVPcf14MRbhk9k%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;600&quot; height=&quot;519&quot; data-origin-width=&quot;600&quot; data-origin-height=&quot;519&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;keepalive packet flow&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The netstat command shows the remaining keepalive timer for each socket, and the kernel provides the following three parameters for TCP keepalive.&lt;/p&gt;
&lt;pre id=&quot;code_1695661141891&quot; class=&quot;shell&quot; style=&quot;background-color: #f8f8f8; color: #383a42; text-align: start;&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;root@server:~$ netstat -napo
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name     Timer
tcp        0      0 127.0.0.1:6010          0.0.0.0:*               LISTEN      -                    off (0.00/0/0)
...                                                                                                  # time remaining on the timer
tcp        0      0 192.168.0.10:22         192.168.1.111:59319     ESTABLISHED -                    keepalive (160.88/0/0)
tcp        0      0 192.168.0.10:34110      192.168.0.11:22         ESTABLISHED 70390/ssh            keepalive (4818.35/0/0)
tcp        0      0 192.168.0.10:22         192.168.1.111:53835     ESTABLISHED -                    keepalive (539.13/0/0)
tcp        0      0 192.168.0.10:55328      192.168.0.11:22         ESTABLISHED 69385/ssh            keepalive (1599.51/0/0)&lt;/code&gt;&lt;/pre&gt;
&lt;pre id=&quot;code_1695661141891&quot; class=&quot;shell&quot; style=&quot;background-color: #f8f8f8; color: #383a42;&quot; data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;shell&quot;&gt;&lt;code&gt;root@server:~# sysctl -a | grep keepalive
net.ipv4.tcp_keepalive_intvl = 75	# interval (s) between keepalive probes while they go unanswered
net.ipv4.tcp_keepalive_probes = 9	# maximum number of unanswered keepalive probes
net.ipv4.tcp_keepalive_time = 7200	# idle time (s) before the first keepalive probe is sent&lt;/code&gt;&lt;/pre&gt;
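&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;As a rough worked example of how the three values combine: an idle peer that has silently disappeared is only declared dead after tcp_keepalive_time plus tcp_keepalive_probes unanswered probes spaced tcp_keepalive_intvl apart. The function name below is made up for illustration.&lt;/p&gt;

```python
def keepalive_detection_seconds(keepalive_time, keepalive_intvl, keepalive_probes):
    """Worst-case seconds before an idle, dead peer is detected.

    The first probe goes out after keepalive_time seconds of idleness,
    each unanswered probe is followed by another one keepalive_intvl
    seconds later, and the connection is dropped once keepalive_probes
    probes have gone unanswered.
    """
    return keepalive_time + keepalive_intvl * keepalive_probes


# Values taken from the sysctl output above.
total = keepalive_detection_seconds(7200, 75, 9)
print(total, "seconds, about", round(total / 60), "minutes")
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;With the defaults shown above this comes to 7875 seconds, a little over two hours.&lt;/p&gt;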
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;2. Keepalive and zombie connections&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The biggest benefit of keepalive is that it prevents zombie connections.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;As in the figure below, suppose a client and a server hold a TCP session and the server becomes unresponsive for some reason without the client ever receiving a FIN or RST. The client then has to wait for some time while still holding the socket. With keepalive enabled, however, the client sends keepalive probes, and if no reply arrives within the timer and probe count, it sends an RST and closes the socket, which prevents this.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1361&quot; data-origin-height=&quot;508&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cCqwkx/btsv7o4u6V4/tHmkjYsErtym3YAvptxutK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cCqwkx/btsv7o4u6V4/tHmkjYsErtym3YAvptxutK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cCqwkx/btsv7o4u6V4/tHmkjYsErtym3YAvptxutK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcCqwkx%2Fbtsv7o4u6V4%2FtHmkjYsErtym3YAvptxutK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1361&quot; height=&quot;508&quot; data-origin-width=&quot;1361&quot; data-origin-height=&quot;508&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;In this way, TCP keepalive lets the kernel manage sessions without the application having to implement its own liveness check.&lt;/p&gt;
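&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;A minimal sketch of what the application side can look like, assuming a Linux host: the application only switches keepalive on for its socket and the kernel does the probing. The helper name and the timer values (60/10/5) are arbitrary choices for illustration, not recommendations.&lt;/p&gt;

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive for one socket and, where the platform
    exposes the Linux-style options, override the system-wide timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE / TCP_KEEPINTVL / TCP_KEEPCNT are the per-socket
    # counterparts of tcp_keepalive_time / _intvl / _probes.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        option = getattr(socket, name, None)  # not defined on every platform
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```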
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;3. TCP keepalive vs HTTP keepalive&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Applications such as apache and nginx also have a keepalive setting, supported by HTTP/1.1. Unlike TCP keepalive, which periodically exchanges small packets to verify the end-to-end connection and socket, HTTP keep-alive works through headers: when a request carries the keep-alive header, the server replies with timeout and max values.&lt;/p&gt;
&lt;pre id=&quot;code_1696063908058&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# Response from an Apache server to a request with the keep-alive header added
root@server:~$ curl -kv {URL} -H &quot;Connection: keep-alive&quot;
...
&amp;lt; HTTP/1.1 200 OK
&amp;lt; Date: Fri, 29 Sep 2023 23:44:20 GMT
&amp;lt; Server: Apache
...
&amp;lt; Keep-Alive: timeout=60, max=100
&amp;lt; Connection: Keep-Alive
&amp;lt; Content-Type: application/json; ...&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;* &lt;b&gt;timeout&lt;/b&gt; : how many seconds an idle connection is kept open&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;* &lt;b&gt;max&lt;/b&gt; : the maximum number of requests that can be exchanged over the connection&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
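&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;In the example above the connection is kept open for 60 seconds and is closed if no further request arrives within that time. Even within the 60 seconds, the connection is closed once more than 100 transactions have been handled.&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Even when the TCP keepalive timer is longer than the HTTP keep-alive timeout, the connection is still closed by HTTP keep-alive.&lt;/p&gt;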
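&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;For a client that wants to act on these values, the header can be read with a few lines of code. This is only a sketch that handles the simple numeric form shown in the response above; the function name is hypothetical.&lt;/p&gt;

```python
def parse_keep_alive(value):
    """Parse a Keep-Alive header value such as "timeout=60, max=100"."""
    params = {}
    for part in value.split(","):
        name, sep, number = part.strip().partition("=")
        # Keep only well-formed name=integer pairs and ignore anything else.
        if sep and number.strip().isdigit():
            params[name.strip().lower()] = int(number)
    return params


print(parse_keep_alive("timeout=60, max=100"))
```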
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;* Keepalive and load balancers: a problem keepalive can solve&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; Load balancers can be deployed in several topologies, including inline and DSR.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;In the &lt;b&gt;inline topology&lt;/b&gt;, both client requests and server responses pass through the load balancer. Because inbound and outbound packets all traverse the load balancer, &lt;u&gt;its traffic load is high, but packet monitoring and filtering are easier&lt;/u&gt;. It suits systems that carry a small volume of management traffic and where monitoring matters, such as keepalived and haproxy load balancing kube-apiserver across three Kubernetes master nodes.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;In the &lt;b&gt;DSR (Direct Server Return) topology&lt;/b&gt;, the server's response to a client request is delivered directly, without passing through the load balancer. There is &lt;u&gt;a constraint that the load balancer and the servers must be on the same network&lt;/u&gt;, but because &lt;u&gt;fewer packets pass through the load balancer, its traffic load drops, so DSR is widely used in large-scale services&lt;/u&gt;.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1525&quot; data-origin-height=&quot;339&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dELlxJ/btsxgZuqfdX/GCD2rscs2P0D7mzNKsBxB1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dELlxJ/btsxgZuqfdX/GCD2rscs2P0D7mzNKsBxB1/img.png&quot; data-alt=&quot;DSR and inline load balancer topologies&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dELlxJ/btsxgZuqfdX/GCD2rscs2P0D7mzNKsBxB1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdELlxJ%2FbtsxgZuqfdX%2FGCD2rscs2P0D7mzNKsBxB1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1525&quot; height=&quot;339&quot; data-origin-width=&quot;1525&quot; data-origin-height=&quot;339&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;DSR and inline load balancer topologies&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The problem that keepalive can solve arises in the DSR topology. In the small hours, when there are few users, &lt;u&gt;if no packets flow through the load balancer for an established session for some time, the load balancer's idle timeout deletes the session table entry&lt;/u&gt;. &lt;u&gt;From the client's point of view the session is still alive, so it sends a new request, but since the entry is gone from the load balancer's session table the request is treated as a new session&lt;/u&gt;. If &lt;u&gt;that request is then routed to a different server than before, the server sees a data request on a session that never went through a TCP handshake and replies with an RST.&lt;/u&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;743&quot; data-origin-height=&quot;315&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/AJm9f/btsw8kfahhz/kuToaBS6Nz1na32asDuKWK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/AJm9f/btsw8kfahhz/kuToaBS6Nz1na32asDuKWK/img.png&quot; data-alt=&quot;A request arriving after the load balancer deleted its session table entry&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/AJm9f/btsw8kfahhz/kuToaBS6Nz1na32asDuKWK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FAJm9f%2Fbtsw8kfahhz%2FkuToaBS6Nz1na32asDuKWK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;710&quot; height=&quot;301&quot; data-origin-width=&quot;743&quot; data-origin-height=&quot;315&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;A request arriving after the load balancer deleted its session table entry&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
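&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The effect of the keepalive timer on the session table can be illustrated with a toy model. This is a sketch under stated assumptions, not how any particular load balancer is implemented: the class, the 600-second idle timeout, and the client address are all hypothetical.&lt;/p&gt;

```python
class SessionTable:
    """Toy load-balancer session table with an idle timeout in seconds."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.last_seen = {}  # flow -> time of the last packet seen

    def packet(self, flow, now):
        """Record a packet and report whether its entry was still present."""
        last = self.last_seen.get(flow)
        known = last is not None and self.idle_timeout >= now - last
        self.last_seen[flow] = now  # an unknown flow is treated as a new session
        return known


def survives_quiet_period(idle_timeout, quiet_seconds, keepalive_every=None):
    """True if the flow's entry is never lost while the client sends nothing
    but keepalive probes (or nothing at all) for quiet_seconds."""
    table = SessionTable(idle_timeout)
    flow = ("192.168.1.111", 59319)  # hypothetical client address and port
    table.packet(flow, 0)            # the handshake creates the entry
    times = []
    if keepalive_every:
        times = list(range(keepalive_every, quiet_seconds, keepalive_every))
    times.append(quiet_seconds)      # the next real request
    return all([table.packet(flow, t) for t in times])


print(survives_quiet_period(600, 3600))                        # no keepalive
print(survives_quiet_period(600, 3600, keepalive_every=7200))  # default timer
print(survives_quiet_period(600, 3600, keepalive_every=300))   # shorter timer
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;In this model only the last case keeps the entry alive, which suggests choosing a keepalive time shorter than the load balancer's idle timeout.&lt;/p&gt;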
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;s&gt;Session idle timeout in nginx, istio, and haproxy (work in progress)&lt;/s&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;2. TCP Retransmission and RTO (Retransmission Timeout)&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;TCP considers a transmission successful only when the sender receives an ACK for the packet it sent. So if the sender does not receive the ACK, it assumes the packet was lost and retransmits it. &lt;u&gt;The timer value for how long to wait for the ACK is called the RTO&lt;/u&gt;.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;722&quot; data-origin-height=&quot;579&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bhai91/btsxgTOyED6/Fvo7bCyHxN9uhzo1ozve1k/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bhai91/btsxgTOyED6/Fvo7bCyHxN9uhzo1ozve1k/img.png&quot; data-alt=&quot;TCP retransmission process&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bhai91/btsxgTOyED6/Fvo7bCyHxN9uhzo1ozve1k/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbhai91%2FbtsxgTOyED6%2FFvo7bCyHxN9uhzo1ozve1k%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;722&quot; height=&quot;579&quot; data-origin-width=&quot;722&quot; data-origin-height=&quot;579&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;TCP retransmission process&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;There are two kinds of RTO: the regular RTO and initRTO. The regular RTO is derived from the end-to-end RTT (Round Trip Time). initRTO is the RTO for the first SYN packet of the TCP handshake. Because the RTT is not yet known at this first step of establishing a session, Linux hard-codes initRTO to 1 second.&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The RTT and RTO of a TCP session can be checked with the ss command.&lt;/p&gt;
&lt;pre id=&quot;code_1696656544823&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;root@server :~&amp;gt; ss -i 
tcp   ESTAB  0      0                                                                             192.168.100.100:60990                   192.168.100.101:etcd-server
         cubic wscale:11,11 rto:204 rtt:0.082/0.028 ato:40 mss:1024 pmtu:9000 rcvmss:1340 advmss:8960 cwnd:10 bytes_sent:1778 bytes_acked:1779 bytes_received:16992591157 segs_out:165276289 segs_in:165458279 data_segs_out:4 data_segs_in:165458274 send 999Mbps lastsnd:2573705048 lastrcv:148 lastack:376 pacing_rate 2Gbps delivery_rate 193Mbps delivered:5 app_limited rcv_rtt:3217.79 rcv_space:117305 rcv_ssthresh:415696 minrtt:0.049
tcp   ESTAB      0      0                                                                               127.0.0.1:etcd-client                  127.0.0.1:50640
         cubic wscale:11,11 rto:212 rtt:9.523/15.989 ato:40 mss:1024 pmtu:65535 rcvmss:1024 advmss:65495 cwnd:10 bytes_sent:81264639 bytes_retrans:85 bytes_acked:81264554 bytes_received:42645091 segs_out:1409087 segs_in:2090909 data_segs_out:1408821 data_segs_in:686052 send 8.6Mbps lastsnd:2724 lastrcv:2724 lastack:2680 pacing_rate 17.2Mbps delivery_rate 1.37Gbps delivered:1408822 app_limited busy:9396380ms retrans:0/2 dsack_dups:2 rcv_rtt:387232 rcv_space:65762 rcv_ssthresh:73687 minrtt:0.009&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The two sessions above retransmit a packet if no ACK arrives within 204ms and 212ms, respectively. The value printed as rtt is the&lt;u&gt; mean/mean deviation&lt;/u&gt;. On average, a round trip on the two sessions takes 0.082ms and 9.523ms.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;* The RTO doubles with every retransmission.&lt;/b&gt;&lt;/p&gt;
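&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The doubling can be written out as a short sketch. It assumes plain exponential backoff capped at 120 seconds (the TCP_RTO_MAX value discussed later in this post); the function name is made up for illustration.&lt;/p&gt;

```python
def rto_schedule_ms(initial_rto_ms, retransmissions, rto_max_ms=120_000):
    """RTO in force before each retransmission: doubles every time,
    but never grows beyond rto_max_ms."""
    schedule = []
    rto = initial_rto_ms
    for _ in range(retransmissions):
        schedule.append(rto)
        rto = min(rto * 2, rto_max_ms)
    return schedule


# Starting from the rto:204 session in the ss output above.
print(rto_schedule_ms(204, 5))
```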
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;3. Retransmission, kernel parameters, and the tcp_write_timeout() function&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The Linux kernel provides the following five parameters for TCP retransmission.&lt;/p&gt;
&lt;pre id=&quot;code_1696657335917&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;root@server:~ # sysctl -a | grep retries
net.ipv4.tcp_syn_retries = 8	# number of retries for a SYN
net.ipv4.tcp_synack_retries = 5	# number of SYN+ACK retries in reply to the peer's SYN
net.ipv4.tcp_orphan_retries = 0	# number of FIN retries in the FIN_WAIT1 state
net.ipv4.tcp_retries1 = 3		# check whether something is wrong at the IP layer (soft threshold)
net.ipv4.tcp_retries2 = 15		# check whether communication is still possible (hard threshold) -&amp;gt; the connection is closed once exceeded&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;These retry counts are consumed by the tcp_write_timeout() function.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;tcp_write_timeout(), defined in tcp_timer.c, takes the kernel parameters above and decides from the retransmission count and the timers whether to close the socket.&lt;/p&gt;
&lt;pre id=&quot;code_1696845941784&quot; class=&quot;cpp&quot; style=&quot;background-color: #f8f8f8; color: #383a42;&quot; data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;cpp&quot;&gt;&lt;code&gt;// net/ipv4/tcp_timer.c

static int tcp_write_timeout(struct sock *sk) {
	struct inet_connection_sock *icsk = inet_csk(sk);
	struct tcp_sock *tp = tcp_sk(sk);
	struct net *net = sock_net(sk);
	bool expired = false, do_reset;
	int retry_until;

	if ((1 &amp;lt;&amp;lt; sk-&amp;gt;sk_state) &amp;amp; (TCPF_SYN_SENT | TCPF_SYN_RECV)) {
		if (icsk-&amp;gt;icsk_retransmits)
			__dst_negative_advice(sk);
		// In SYN_SENT or SYN_RECV, use tcp_syn_retries
		// unless a per-socket value is set.
		retry_until = icsk-&amp;gt;icsk_syn_retries ? :
			READ_ONCE(net-&amp;gt;ipv4.sysctl_tcp_syn_retries);

		// Expired once the retransmission count reaches the retry limit.
		expired = icsk-&amp;gt;icsk_retransmits &amp;gt;= retry_until;
	} else {
		// retransmits_timed_out(): based on when the packet was first sent,
		// reports a timeout if the timer ran out before an ACK arrived.
		if (retransmits_timed_out(sk, READ_ONCE(net-&amp;gt;ipv4.sysctl_tcp_retries1), 0)) {
			// Behaviour depends on sysctl_tcp_mtu_probing (default = disabled):
			// adjusts the MSS before retransmitting so that the network MTU
			// between the two endpoints does not cause fragmentation.
			tcp_mtu_probing(icsk, sk);

			// Sets a negative flag on the dst_entry structure (destination cache),
			// which holds the destination IP, next hop, and so on.
			__dst_negative_advice(sk);
		}

		retry_until = READ_ONCE(net-&amp;gt;ipv4.sysctl_tcp_retries2);
		if (sock_flag(sk, SOCK_DEAD)) {
			// A dead socket still counts as alive while its RTO
			// is below TCP_RTO_MAX (120s).
			const bool alive = icsk-&amp;gt;icsk_rto &amp;lt; TCP_RTO_MAX;

			// Orphan socket: tcp_orphan_retries applies instead of sysctl_tcp_retries2.
			retry_until = tcp_orphan_retries(sk, alive);

			...
		}
	}
	if (!expired)
		expired = retransmits_timed_out(sk, retry_until, icsk-&amp;gt;icsk_user_timeout);
	...

	if (expired) {
		// No ACK within the retry limit: record the error and close the socket.
		tcp_write_err(sk);
		return 1;
	}

	...
}&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;* tcp_synack_retries and SYN flooding&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;When the server receives a SYN and sends a SYN+ACK, the socket enters the SYN_RECV state. In a SYN flooding attack, a type of DDoS, clients (zombie hosts) send large numbers of SYNs during the 3-way handshake and never answer the server's SYN+ACK. The server replies to each SYN with a SYN+ACK but never receives the ACK, so it retransmits the SYN+ACK as many times as tcp_synack_retries allows while the socket stays in SYN_RECV. Because this can exhaust server resources, it is advisable to lower tcp_synack_retries to a suitable value.&lt;/p&gt;
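&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;To get a feel for how much lowering the value helps, the time a half-open connection lingers can be estimated. This is a rough sketch that assumes the 1-second initial RTO mentioned earlier and plain doubling; the function name is hypothetical.&lt;/p&gt;

```python
def syn_recv_hold_seconds(synack_retries, init_rto=1):
    """Seconds a half-open connection stays in SYN_RECV when the final ACK
    never arrives: the initial wait plus one doubled wait per retransmission."""
    return sum(init_rto * 2 ** attempt for attempt in range(synack_retries + 1))


for retries in (5, 3, 2):
    print(retries, "retries:", syn_recv_hold_seconds(retries), "seconds")
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Under these assumptions the value of 5 shown above holds each half-open socket for 63 seconds, while 2 cuts that to 7 seconds.&lt;/p&gt;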
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;* tcp_orphan_retries and orphan sockets&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp; Sockets in the FIN_WAIT and TIME_WAIT states have been handed back to the kernel and are waiting to be cleaned up. During the 4-way handshake that closes a TCP connection, &lt;u&gt;after sending its FIN the server only receives packets&lt;/u&gt;, so tcp_orphan_retries applies only to FIN_WAIT1 sockets. In other words,&lt;u&gt; a socket in the FIN_WAIT1 state is the orphan socket&lt;/u&gt;.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;726&quot; data-origin-height=&quot;424&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b6cBVa/btsxq4Qltty/DxkfNE8KvnQDpxwgFwbyP0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b6cBVa/btsxq4Qltty/DxkfNE8KvnQDpxwgFwbyP0/img.png&quot; data-alt=&quot;4 way handshake&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b6cBVa/btsxq4Qltty/DxkfNE8KvnQDpxwgFwbyP0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb6cBVa%2Fbtsxq4Qltty%2FDxkfNE8KvnQDpxwgFwbyP0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;726&quot; height=&quot;424&quot; data-origin-width=&quot;726&quot; data-origin-height=&quot;424&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;4 way handshake&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The tcp_orphan_retries value is used in the tcp_orphan_retries() function.&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1696844202123&quot; class=&quot;reasonml&quot; style=&quot;background-color: #f8f8f8; color: #383a42;&quot; data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;cpp&quot;&gt;&lt;code&gt;// net/ipv4/tcp_timer.c

/**
 *  tcp_orphan_retries() - Returns maximal number of retries on an orphaned socket
 *  @sk:    Pointer to the current socket.
 *  @alive: bool, socket alive state
 */
static int tcp_orphan_retries(struct sock *sk, bool alive) {
	int retries = READ_ONCE(sock_net(sk)-&amp;gt;ipv4.sysctl_tcp_orphan_retries); /* May be zero. */

	/* We know from an ICMP that something is wrong. */
	if (sk-&amp;gt;sk_err_soft &amp;amp;&amp;amp; !alive)
		retries = 0;

	/* However, if socket sent something recently, select some safe number of retries.
    * 8 corresponds to &amp;gt;100 seconds with minimal RTO of 200msec. */
	if (retries == 0 &amp;amp;&amp;amp; alive)
		retries = 8;
	return retries;
}&lt;/code&gt;&lt;/pre&gt;
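&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;If tcp_orphan_retries is 0 and the socket is alive, 8 is returned as the retry count; if the parameter is non-zero, that value is returned (unless an ICMP error has been recorded and the socket is no longer alive, in which case the result is 0). The alive state of the socket is determined in the tcp_write_timeout() function above: for an orphan socket it becomes 0 (false) once the RTO reaches TCP_RTO_MAX (120s). Since the RTO rarely grows to 120s, the value is almost always 1.&lt;/p&gt;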
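&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The branches are easier to follow when transcribed into a few lines of Python. This is only a reading aid that mirrors the C excerpt above; the Python names are made up and nothing here talks to the kernel.&lt;/p&gt;

```python
def orphan_retries(sysctl_value, soft_error, alive):
    """Transcription of the tcp_orphan_retries() excerpt quoted above."""
    retries = sysctl_value
    if soft_error and not alive:  # an ICMP error was seen and the RTO hit its cap
        retries = 0
    if retries == 0 and alive:    # fall back to a safe number of retries
        retries = 8
    return retries


def is_alive(rto_ms, rto_max_ms=120_000):
    """The alive flag from tcp_write_timeout(): RTO still below TCP_RTO_MAX."""
    return rto_max_ms > rto_ms


print(orphan_retries(0, soft_error=False, alive=is_alive(200)))  # default sysctl
print(orphan_retries(7, soft_error=False, alive=is_alive(200)))  # tuned sysctl
```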
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
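&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;If the FIN is retransmitted the configured number of times in FIN_WAIT1 without an ACK being received, the socket is reclaimed immediately, without moving to FIN_WAIT2 or TIME_WAIT. So if tcp_orphan_retries is too small, FIN_WAIT1 sockets are cleaned up too quickly and the peer's socket may never be closed. A value of about 7, which works out to roughly the 60 seconds that TIME_WAIT lasts, is therefore a good choice.&lt;/p&gt;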
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;* &lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc1122#page-100&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;tcp_retries1 vs tcp_retries2 (RFC-1122)&lt;/a&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Below is the description shown by the man tcp command on ubuntu.&lt;/p&gt;
&lt;pre id=&quot;code_1696831742784&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;tcp_retries1 (integer; default: 3; since Linux 2.2)
       The  number  of  times  TCP will attempt to retransmit a packet on an established connection normally, without the extra effort of getting the network layers involved.  Once we exceed this number of retransmits, we
       first have the network layer update the route if possible before each new retransmit.  The default is the RFC specified minimum of 3.

tcp_retries2 (integer; default: 15; since Linux 2.2)
       The maximum number of times a TCP packet is retransmitted in established state before giving up.  The default value is 15, which corresponds to a duration of approximately between 13 to 30 minutes, depending on the
       retransmission timeout.  The RFC 1122 specified minimum limit of 100 seconds is typically deemed too short.&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;b&gt;tcp_retries1 (R1)&lt;/b&gt;: when retransmits_timed_out() judges against this threshold that the connection has timed out, dst_negative_advice() is called. dst_negative_advice() sets a negative flag on the dst_entry structure (destination cache), which holds the destination IP, next hop (gateway), timestamp, and similar information. (Soft threshold)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;b&gt;tcp_retries2 (R2)&lt;/b&gt; is assigned to retry_until and used by retransmits_timed_out() to check whether the connection has expired. (Hard threshold)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;As a result, &lt;u&gt;the session is closed only after the count defined by tcp_retries2 is exceeded&lt;/u&gt;.&lt;/p&gt;
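&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The two thresholds can be turned into elapsed time with a little arithmetic. The sketch below assumes the RTO starts at 200ms, doubles on every retransmission, and is capped at 120s; the function name and this simplified model are illustrative assumptions rather than the kernel's exact code.&lt;/p&gt;

```python
def give_up_after_ms(boundary, rto_base_ms=200, rto_max_ms=120_000):
    """Elapsed time after which `boundary` retransmissions count as timed out,
    for an RTO that starts at rto_base_ms and doubles up to rto_max_ms."""
    # How many doublings fit before the RTO reaches the cap.
    doublings = (rto_max_ms // rto_base_ms).bit_length() - 1
    if doublings >= boundary:
        return (2 ** (boundary + 1) - 1) * rto_base_ms
    capped = boundary - doublings  # retransmissions that wait the full cap
    return (2 ** (doublings + 1) - 1) * rto_base_ms + capped * rto_max_ms


print("tcp_retries1 = 3 :", give_up_after_ms(3) / 1000, "seconds")
print("tcp_retries2 = 15:", give_up_after_ms(15) / 1000, "seconds")
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Under these assumptions tcp_retries2 = 15 corresponds to 924.6 seconds, roughly 15 minutes, which is consistent with the range of 13 to 30 minutes quoted in the man page above.&lt;/p&gt;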
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt; &amp;nbsp;* &lt;/b&gt;&lt;a style=&quot;color: #0070d1; text-align: start;&quot; href=&quot;https://datatracker.ietf.org/doc/html/rfc1122#page-51&quot;&gt;negative advice&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;nbsp;4. Changing RTO_MIN&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;RTO_MIN is defined as 200ms in tcp.h. Even for internal-network traffic with a small RTT, the RTO cannot drop below 200ms.&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1696659162676&quot; class=&quot;cpp&quot; style=&quot;background-color: #f8f8f8; color: #383a42; text-align: start;&quot; data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;cpp&quot;&gt;&lt;code&gt;// include/net/tcp.h
#define TCP_RTO_MAX ((unsigned)(120*HZ))
#define TCP_RTO_MIN ((unsigned)(HZ/5))
/* HZ is the number of timer ticks (jiffies) per second, so HZ jiffies = 1 s */
/* RTO_MAX = 120 s, RTO_MIN = 200 ms */&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
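&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;However, the rto_min option of the ip route command can change the minimum RTO. It cannot be set per session. It is set on a route, so in practice it changes the value for traffic that leaves through a given NIC interface.&lt;/p&gt;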
&lt;pre id=&quot;code_1696657225238&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;root@sevrer:~ # ip route change default via &amp;lt;GW&amp;gt; dev &amp;lt;DEVICE&amp;gt; rto_min 100ms&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The commands below set rto_min to 150 ms on the eth0 route. ss -i then shows that the rto of an ssh session connected through eth0 has become 156 (ms).&lt;/p&gt;
&lt;pre id=&quot;code_1696859824456&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;root@sevrer:~ # ip route change 192.168.100.0/24 dev eth0 rto_min 150ms
root@sevrer:~ # ip route
...
192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.104 rto_min lock 150ms
...&lt;/code&gt;&lt;/pre&gt;
&lt;pre id=&quot;code_1696859857156&quot; class=&quot;shell&quot; data-ke-language=&quot;shell&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;root@sevrer:~ # ss -i
tcp            ESTAB          0               0                                                                                                192.168.100.104:ssh                                         192.168.100.100:64912
         cubic wscale:11,11 rto:156 rtt:0.05/0.007 ato:60 mss:8960 pmtu:9000 rcvmss:1024 advmss:8960 cwnd:10 bytes_sent:30765 bytes_acked:30765 bytes_received:20449 segs_out:473 segs_in:856 data_segs_out:459 data_segs_in:402 send 14.3Gbps lastsnd:24 lastrcv:24 lastack:24 pacing_rate 28.2Gbps delivery_rate 3.98Gbps delivered:460 app_limited busy:228ms rcv_space:56576 rcv_ssthresh:56576 minrtt:0.038&lt;/code&gt;&lt;/pre&gt;
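&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;To check the value across many sessions instead of reading it by eye, the rto and rtt fields can be pulled out of each ss -i detail line. The snippet below is a minimal sketch: parse_ss_info is a hypothetical helper, and the sample string is a shortened copy of the output above.&lt;/p&gt;

```python
import re


def parse_ss_info(line):
    """Extract rto, rtt and rttvar (all in ms) from one `ss -i` detail line."""
    # \b keeps "rtt:" from matching inside "minrtt:".
    rto = re.search(r"\brto:([\d.]+)", line)
    rtt = re.search(r"\brtt:([\d.]+)/([\d.]+)", line)
    return {
        "rto": float(rto.group(1)) if rto else None,
        "rtt": float(rtt.group(1)) if rtt else None,
        "rttvar": float(rtt.group(2)) if rtt else None,
    }


sample = "cubic wscale:11,11 rto:156 rtt:0.05/0.007 ato:60 mss:8960 minrtt:0.038"
print(parse_ss_info(sample))
```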
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
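&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;For externally exposed front-end servers, or for systems that also use a public cloud such as AWS or Azure, it is better to keep the default of 200 ms. Servers that only communicate internally have a very short RTT, however, so setting a correspondingly small rto_min can improve service quality.&lt;/p&gt;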
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;5. TCP Retransmission and Application Timeout&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;When a TCP retransmission occurs, whether the application also times out depends on the application's timeout threshold.&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;Application-level timeouts fall into two main types: connection timeout and read timeout.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;b&gt;Connection Timeout&lt;/b&gt; is &lt;u&gt;a timeout caused by the loss of a SYN or SYN+ACK packet during the 3-way handshake&lt;/u&gt;. Because initRTO is hard-coded to 1 second in the kernel, a value of 3 seconds (1 + 2) or more, which leaves room for two retransmissions, is a good choice.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;b&gt;Read Timeout&lt;/b&gt; is &lt;u&gt;a timeout on reading data&lt;/u&gt; over an already established session. It has to account for the RTT between the two endpoints and the time the server takes to process the request. For example, if the RTT between client A and server B is 30 ms and the server takes 200 ms to process the request, the minimum time needed for one request and response is 230 ms.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;If we also allow for one retransmission at the kernel's minimum RTO of 200 ms, the time until the client receives the response to its request becomes 430 ms.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The read timeout should therefore be set with the RTT and the request processing time in mind, plus the RTO for any retransmission you want to tolerate.&lt;/p&gt;
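&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;The arithmetic in this section can be sketched as follows. Both helpers are hypothetical names. They assume an initial SYN RTO of 1 s, a data RTO sitting at the 200 ms minimum, and a doubling of the RTO after every timeout.&lt;/p&gt;

```python
def connect_timeout_floor_s(syn_retransmits, init_rto_s=1):
    """Seconds until the last tolerated SYN retransmission is sent."""
    return sum(init_rto_s * 2 ** i for i in range(syn_retransmits))


def read_timeout_floor_ms(rtt_ms, processing_ms, retransmits=1, rto_ms=200):
    """RTT + server processing time + the RTO waits for tolerated retransmissions."""
    rto_wait = sum(rto_ms * 2 ** i for i in range(retransmits))
    return rtt_ms + processing_ms + rto_wait


print(connect_timeout_floor_s(2))      # 3
print(read_timeout_floor_ms(30, 200))  # 430
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;A timeout set exactly at these floors leaves no margin, so real values would normally sit somewhat above them.&lt;/p&gt;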
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;836&quot; data-origin-height=&quot;492&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bsMG4v/btsxq4QTHAD/B3OW05X7P5XBZKYTWPmW3K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bsMG4v/btsxq4QTHAD/B3OW05X7P5XBZKYTWPmW3K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bsMG4v/btsxq4QTHAD/B3OW05X7P5XBZKYTWPmW3K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbsMG4v%2Fbtsxq4QTHAD%2FB3OW05X7P5XBZKYTWPmW3K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;777&quot; height=&quot;457&quot; data-origin-width=&quot;836&quot; data-origin-height=&quot;492&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Reference&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://jw910911.tistory.com/35&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;https://jw910911.tistory.com/35&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://nginxstore.com/blog/nginx/nginx%EB%A5%BC-%ED%88%AC%EB%AA%85-%ED%94%84%EB%A1%9D%EC%8B%9C-transparent-proxy%EB%A1%9C-%EC%82%AC%EC%9A%A9%ED%95%98%EA%B8%B0/#3-7&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;https://nginxstore.com/blog/nginx/nginx%EB%A5%BC-%ED%88%AC%EB%AA%85-%ED%94%84%EB%A1%9D%EC%8B%9C-transparent-proxy%EB%A1%9C-%EC%82%AC%EC%9A%A9%ED%95%98%EA%B8%B0/#3-7&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://kvaps.medium.com/for-make-this-scheme-more-safe-you-can-add-haproxy-layer-between-keepalived-and-kube-apiservers-62c344283076&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;https://kvaps.medium.com/for-make-this-scheme-more-safe-you-can-add-haproxy-layer-between-keepalived-and-kube-apiservers-62c344283076&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc1122&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;https://datatracker.ietf.org/doc/html/rfc1122&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://alden-kang.tistory.com/20&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;https://alden-kang.tistory.com/20&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>System Engineering/Linux</category>
      <author>Hopulence</author>
      <guid isPermaLink="true">https://hopulence.tistory.com/40</guid>
      <comments>https://hopulence.tistory.com/40#entry40comment</comments>
      <pubDate>Tue, 26 Sep 2023 01:59:18 +0900</pubDate>
    </item>
  </channel>
</rss>