Benefits of Database Reconstruction
- Improve database performance by eliminating existing data and index fragmentation
- Minimize future data fragmentation by switching to Type II storage areas (introduced in OpenEdge 10)
- Reclaim disk space (around 10% is typically reclaimable)
- Find and fix problem records
Step 1 - Create an Empty Database
Use an 8k block size
Creating a new empty database with an 8k block size can improve performance by generating less disk I/O when running large reports.
Optimize DB structure file
Split oversized extents into multiple 2 GB fixed-size extents
If any single DB extent file is physically larger than 2 GB, it is better to add a new extent for that area and resize the existing extent into one or more 2 GB fixed-size files.
- Change the first extent (.d1) of area 27 (table "trgl_det") to 2 GB fixed size
- Change the first extent (.d1) of area 28 (index area for table "trgl_det") to 2 GB fixed size
- Add one more extent each to the "trhist" (area 25) and "gltrhist" (area 35) tables
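As an illustration, the reshaped area 27 might look roughly like this in the .st file (the path, records-per-block, and cluster-size values here are assumptions mirroring the CAS area example later in this document); `f 2097152` declares a 2 GB fixed-size extent, with the size given in KB, followed by a variable-size overflow extent:

```
d "trgl_det":27,128;64 /path/to/db/db-name_27.d1 f 2097152
d "trgl_det":27,128;64 /path/to/db/db-name_27.d2
```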
Create the new DB with Type II storage areas
The Type II storage area is a feature introduced in OpenEdge 10. Its main benefit is that it reduces future data fragmentation by pre-allocating block clusters for tables. For more information about Type II storage areas, see the Progress OpenEdge documentation.
Move application tables out of the 'Schema Area'
Reasons for this change
- The CAS (China Accounting System) tables are currently defined and created in the 'Schema Area'. However, per Progress best practice, the 'Schema Area' should be used only by the database itself to store its metadata; Progress recommends placing all application data in additional data areas.
- The 'Schema Area' cannot be converted to a Type II storage area.
- The major CAS table (xglh_hist) is large and contains many records. Moving all CAS tables to a new, dedicated Type II storage area can improve performance.
- Add a new area 'CAS' (area 37) to the .st file, with its own extents, to hold the CAS tables
d "CAS":37,128;64 /path/to/db/db-name_37.d1 f 2097152
d "CAS":37,128;64 /path/to/db/db-name_37.d2
Recalculate records per block
Since the old DB block size is 4k, if we recreate the new DB with an 8k block size we need to recalculate and update the records-per-block settings in the current .st file.
This article explains in detail how to calculate the best number of records per block.
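As a rough sketch of that calculation (the 90-byte mean record size and the 2-byte per-record overhead below are illustrative assumptions; take the real mean record size from a dbanalys report), keeping in mind that records per block must be a power of two between 1 and 256:

```shell
#!/bin/sh
# Illustrative inputs -- replace with values from your own dbanalys report.
BLOCK_SIZE=8192        # new DB block size in bytes
MEAN_RECORD_SIZE=90    # assumed mean record size in bytes
OVERHEAD=2             # assumed per-record overhead in bytes

# How many records fit in one block, ignoring the power-of-two rule.
RAW=$(( BLOCK_SIZE / (MEAN_RECORD_SIZE + OVERHEAD) ))

# Round down to the nearest power of two, capped at 256.
RPB=1
while [ $(( RPB * 2 )) -le "$RAW" ] && [ "$RPB" -lt 256 ]; do
  RPB=$(( RPB * 2 ))
done
echo "records per block: $RPB"
```

With these assumed inputs the sketch yields 64 records per block; the value then goes into the area definition lines of the .st file.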
Step 2 - Create New Database Files
Create a void DB from the optimized .st file
prostrct create db-name db-name.st -blocksize 8192
Add schema to the void DB
procopy $DLC/emptyDB_GB2312/empty8.db db_name
Step 3 - Dumping
- Don't dump the hidden tables (table names starting with an underscore '_'); they are system control tables, not part of the business data, and will be automatically created and updated after data loading.
Update the dumped .df files
Since we plan to move the CAS tables out of the current 'Schema Area' into the 'CAS' area, we need to make the following change to the dumped .df file:

Change
AREA "Schema Area"
to
AREA "CAS"
in db_name.df

We also need to move all customization tables out of the customization DB's 'Schema Area' into the 'CUSTOMIZATION' area:

Change
AREA "Schema Area"
to
AREA "CUSTOMIZATION"
in at3locprod.df
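A sketch of that substitution using GNU sed (the table definition below is fabricated for illustration; real .df files come from the dictionary dump, and you should keep a backup copy before editing them):

```shell
#!/bin/sh
# Fabricated .df fragment standing in for the real dictionary dump output.
cat > db_name.df <<'EOF'
ADD TABLE "xglh_hist"
  AREA "Schema Area"
  DUMP-NAME "xglh_hist"
EOF

# Rewrite the storage area for every table definition in the file.
# (GNU sed syntax; on BSD sed use: sed -i '' 's/.../.../')
sed -i 's/AREA "Schema Area"/AREA "CAS"/' db_name.df

# Verify no table is left in the Schema Area.
grep 'AREA' db_name.df
```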
Dump table contents
- Dump all table contents from the current DB in text format using the Data Dictionary tool
- Don't use a binary dump: it preserves records as-is, so it will not defragment or fix anything
Create a bulk loader description file
Error Records Found
Note: No errors were reported during the dumping phase, but that does not mean everything is fine. The errors above were reported during the data loading phase because some records contain control characters, probably entered by users when copying and pasting data from Excel.
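One way to locate such records before loading is to scan the dump files for control characters; a sketch assuming GNU grep (the sample file below is fabricated, with a stray 0x01 byte on line 2):

```shell
#!/bin/sh
# Fabricated dump file standing in for a real table_name.d file.
printf 'good record\nbad\001record\n' > sample.d

# Report lines containing non-printable control characters
# (tab, LF and CR excluded); -P (Perl regex) requires GNU grep.
grep -n -P '[\x00-\x08\x0B\x0C\x0E-\x1F]' sample.d
```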
Step 4 - Data Loading
1st Round Load
Load database definitions from the Data Dictionary
Load table contents with bulk loading from the command line
proutil db-name -C bulkload DescriptionFile.fd -B 1000000
Find error records in the table_name.e files
Try to fix the found error records manually
2nd Round Load
With plan A
- You don't need to rebuild indexes after the data load, as the Data Dictionary does it automatically when loading .d files.
- You also don't need to create description files.
Delete the entire DB and recreate it, because corrupted records may have been loaded into the database during the 1st-round load.
Load the .d files for small and medium-sized tables from the Data Dictionary
Load the .d files for large tables from the Data Dictionary, one after another
Tip: When loading data with the Data Dictionary, it builds indexes at the same time. This usually exhausts your RAM, and the loading session gets killed by the OS. You can split large .d files into several smaller files with the Linux 'split' command, then load the sliced .d files one by one.

```
split -l 1000000 xxxx.d
```
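A miniature demonstration of the split step (the 5-line file below is fabricated; real .d files are produced by the dictionary dump). By default split names the slices xaa, xab, xac, ..., and each slice can then be loaded separately:

```shell
#!/bin/sh
# Fabricated 5-line dump file standing in for a large table's .d file.
printf '1\n2\n3\n4\n5\n' > xxxx.d

# Split into chunks of at most 2 lines each -> xaa, xab, xac.
split -l 2 xxxx.d

# The slices can then be loaded one by one.
ls xa?
```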
With plan B
- Delete the entire DB and recreate it, because corrupted records may have been loaded into the database during the 1st-round load
- Bulk load small and medium-sized tables, leaving the large tables aside
- Rebuild indexes for all small and medium tables
- Bulk load the large tables and rebuild their indexes one by one
Back up the new database before starting the index rebuild
Rebuild all inactive indexes
proutil db-name -C idxbuild inactiveindexes
Rebuild the indexes for a specific table
proutil db-name -C idxbuild table table_name
Create a DB analysis report
Review the Scatter Index for the newly restructured database. This number is important for future reference, since it is the best achievable Scatter Index value you can have; it is the baseline against which you should compare future Scatter Index measurements.
proutil db_name -C dbanalys > dbname_analys.txt
Initialize sequences in QAD
Since dump/load does not carry over any sequence information, you need to initialize/reset the sequence numbers from the QAD application (36.16.17 utsequp.p).
Note: This is a critical step; please don't forget to do it.
The entire database reconstruction takes considerable effort and usually needs 1-2 days, depending on hardware. Different problems and issues may surface from your own application and data.
Please fully test the above plan in a non-production environment before trying it in production.