LOB 不当的chunk size会导致严重的空间浪费
前段时间一客户的Oracle数据库使用datapump做了迁移,发现相同数据LOB段迁移前后占用空间有原来的45G增长到了103GB, 朋友在墨天轮社区记录了这个问题click here, 主要原因是因为使用了不同的Lob Chunk Size,导致的空间浪费。这里简单的记录一下这个问题。
LOB Storage
This section summarizes LOB storage characteristics to consider when designing tables with LOB column types.
Inline and Out-of-Line LOB Storage
LOB columns store locators that reference the location of the actual LOB value. Depending on the column properties you specify when you create the table, and depending the size of the LOB, actual LOB values are stored either in the table row (inline) or outside of the table row (out-of-line).
LOB values are stored out-of-line when any of the following situations apply:
- By default. That is, if you do not specify a LOB parameter for the LOB storage clause when you create the table.
- When you explicitly specify DISABLE STORAGE IN ROW for the LOB storage clause when you create the table.
- When the size of the LOB is greater than 3964 bytes, the LOB value for the LOB instance (regardless of the LOB storage properties for the column).
- If you update a LOB that is stored out-of-line and the resulting LOB is less than 3964 bytes in size, it is still stored out-of-line.
LOB values are stored inline when any of the following conditions apply:
- When you explicitly specify ENABLE STORAGE IN ROW for the LOB storage clause when you create the table, and the size of the LOB stored in the given row is small, 4K bytes or less.
- When the LOB value is NULL (regardless of the LOB storage properties for the column).
Using the default LOB storage properties (inline storage) can allow for better database performance; it avoids the overhead of creating and managing out-of-line storage for smaller LOB values. If LOB values stored in your database are frequently small in size, then using inline storage is recommended.
CHUNK
A chunk is one or more Oracle blocks. You can specify the chunk size for the LOB when creating the table that contains the LOB. This corresponds to the data size used by Oracle Database when accessing or modifying the LOB value. Part of the chunk is used to store system-related information and the rest stores the LOB value. The API you are using has a function that returns the amount of space used in the LOB chunk to store the LOB value. In PL/SQL use DBMS_LOB.GETCHUNKSIZE
.
If the tablespace block size is the same as the database block size, then CHUNK
is also a multiple of the database block size. The default CHUNK
size is equal to the size of one tablespace block, and the maximum value is 32K.
Choosing the Value of CHUNK
Once the value of CHUNK
is chosen (when the LOB column is created), it cannot be changed. Hence, it is important that you choose a value which optimizes your storage and performance requirements.
Space Considerations
The value of CHUNK
does not matter for LOBs that are stored inline. This happens when ENABLE
STORAGE
IN
ROW
is set, and the size of the LOB locator and the LOB data is less than 4000 bytes. However, when the LOB data is stored out-of-line, it will always take up space in multiples of the CHUNK
parameter. This can lead to a large waste of space if your data is small, but the CHUNK
is set to a large number. The following table illustrates this point:
Table 4-4 Data Size and CHUNK Size
Data Size | CHUNK Size | Disk Space Used to Store the LOB | Space Utilization (Percent) |
---|---|---|---|
3500 enable storage in row | irrelevant | 3500 in row | 100 |
3500 disable storage in row | 32 KB | 32 KB | 10 |
3500 disable storage in row | 4 KB | 4 KB | 90 |
33 KB | 32 KB | 64 KB | 51 |
2 GB +10 | 32 KB | 2 GB + 32 KB | 99+ |
Performance Considerations
Accessing lobs in big chunks is more efficient. You can set CHUNK
to the data size most frequently accessed or written. For example, if only one block of LOB data is accessed at a time, then set CHUNK
to the size of one block. If you have big LOBs, and read or write big amounts of data, then choose a large value for CHUNK
.
DEMO
— test oracle 19.3 on linux ,block size 8k
create table anbob.tab_lob( id number, blob_8k blob, blob_32k blob ) lob (blob_8k) store as basicfile ( enable storage in row chunk 8192) lob (blob_32k) store as basicfile ( enable storage in row chunk 32768); insert into anbob.tab_lob values (1, empty_blob(), empty_blob()); commit; select id, dbms_lob.getchunksize(blob_8k) blob_8k, dbms_lob.getchunksize(blob_32k) blob_32k from anbob.tab_lob; ID BLOB_8K BLOB_32K ---------- ---------- ---------- 1 8132 32528 SQL> truncate table anbob.tab_lob; Table truncated. SQL> declare l_lob blob; len number:=0; BEGIN dbms_lob.createtemporary(lob_loc => l_lob, cache => true); while len<5000 loop DBMS_LOB.WRITEAPPEND(l_lob,1000,hextoraw(rpad('01',2000,'01'))); select dbms_lob.getlength(l_lob) into len from dual ; end loop; for i in 1..1000 loop INSERT INTO anbob.tab_lob VALUES (i,l_lob,l_lob); COMMIT; end loop; END; PL/SQL procedure successfully completed. SQL> select dbms_lob.getlength(blob_8k),dbms_lob.getlength(blob_32k) from anbob.tab_lob where rownum<=10; DBMS_LOB.GETLENGTH(BLOB_8K) DBMS_LOB.GETLENGTH(BLOB_32K) --------------------------- ---------------------------- 5000 5000 5000 5000 5000 5000 5000 5000 5000 5000 5000 5000 5000 5000 5000 5000 5000 5000 5000 5000 10 rows selected. SQL> @lob anbob.% OWNER TABLE_NAME COLUMN_NAME SEGMENT_NAME TABLESPACE_NAME INDEX_NAME CHUNK PCTVERSION RETENTION FREEPOOLS CACHE LOGGING ENCR COMPRE DEDUPLICATION IN_ FORMAT PAR SEC SEG RETENTI RETENTION_VALUE --------- - -------------- --------------- ------------------------------ ------------------------------ ------------------------------- ---------- ---------- ---------- ---------- ---------- ------- ---- ------ --------------- --- --------------- --- --- --- ------- --------------- ANBOB TAB_LOB BLOB_8K SYS_LOB0000079303C00002$$ USERS SYS_IL0000079303C00002$$ 8192 900 NO YES NONE NONE NONE YES NOT APPLICABLE NO NO YES YES ANBOB TAB_LOB BLOB_32K SYS_LOB0000079303C00003$$ USERS SYS_IL0000079303C00003$$ 32768 900 NO YES NONE NONE NONE YES NOT APPLICABLE NO NO YES YES SQL> @seg anbob.sys_ SEG_MB OWNER SEGMENT_NAME SEG_PART_NAME SEGMENT_TYPE SEG_TABLESPACE_NAME BLOCKS HDRFIL HDRBLK ---------- ---------- ------------------------------ --------------- - -------------------- ------------------------------ ---------- ---------- ---------- 0 ANBOB SYS_IL0000079303C00002$$ LOBINDEX USERS 8 12 27234 0 ANBOB SYS_IL0000079303C00003$$ LOBINDEX USERS 8 12 27250 9 ANBOB SYS_LOB0000079303C00002$$ LOBSEGMENT USERS 1152 12 27226 33 ANBOB SYS_LOB0000079303C00003$$ LOBSEGMENT USERS 4224 12 27242 SQL> SELECT SUM(BYTES) FROM DBA_EXTENTS WHERE SEGMENT_NAME = 'SYS_LOB0000079303C00003$$'; SUM(BYTES) ---------- 34603008 SQL> SELECT SUM(BYTES) FROM DBA_EXTENTS WHERE SEGMENT_NAME = 'SYS_LOB0000079303C00002$$'; SUM(BYTES) ---------- 9437184 SQL> SELECT SUM(DBMS_LOB.GETLENGTH(blob_8k)) "LOB DATA" FROM anbob.tab_lob; LOB DATA ---------- 5000000 SQL> SELECT SUM(DBMS_LOB.GETLENGTH(blob_32k)) "LOB DATA" FROM anbob.tab_lob; LOB DATA ---------- 5000000
Note:
可见对于平均长度为5000bytes的LOB段, 32K的lob chunk浪费了更多的存储空间。
对不起,这篇文章暂时关闭评论。