Release Notes

MindSpore 2.3.0-rc2 Release Notes

Major Features and Improvements

AutoParallel

  • [STABLE] Transpose/Sub/Add/Mul/Div/ReLU/Softmax/Sigmoid support layout configuration.

  • [STABLE] Collective communication precision affects network convergence. The force_fp32_communication option is provided in mindspore.set_auto_parallel_context; when it is set to True, the communication type of reduce communication operators is forced to float32.

  • [BETA] Pipeline parallelism supports interleaved scheduling, which improves performance when the number of micro-batches is limited.

  • [BETA] Speed up checkpoint transformation under pipeline parallelism, and support transforming a single stage.
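
The precision effect that force_fp32_communication guards against can be illustrated without a cluster. The sketch below is not MindSpore code: it uses Python's struct module to emulate a reduction that accumulates in float16 versus float32. The helper names (round_to, accumulate) and the chosen value and count are hypothetical and only serve the illustration.

```python
import struct


def round_to(fmt, x):
    """Round a Python float to a struct format ('e' = float16, 'f' = float32)."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def accumulate(fmt, value, count):
    """Add `value` to a running sum `count` times, rounding after every addition."""
    term = round_to(fmt, value)
    total = 0.0
    for _ in range(count):
        total = round_to(fmt, total + term)
    return total


fp16_total = accumulate("e", 0.001, 10_000)  # emulated float16 reduction
fp32_total = accumulate("f", 0.001, 10_000)  # emulated float32 reduction

print(f"float16: {fp16_total}, float32: {fp32_total}, exact: 10.0")
```

In this toy case the float16 running sum stops growing once each addend is smaller than half the spacing between representable values, while the float32 sum stays close to 10. On a real job, the corresponding switch is mindspore.set_auto_parallel_context(force_fp32_communication=True).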

PyNative

API Change

Add timeout environment variables for dynamic networking scenarios:

  • MS_TOPO_TIMEOUT: Timeout for the cluster networking phase, in seconds.

  • MS_NODE_TIMEOUT: Node heartbeat timeout, in seconds.

  • MS_RECEIVE_MSG_TIMEOUT: Timeout for a node to receive messages, in seconds.

Add the environment variable MS_ENABLE_LCCL to support the use of the LCCL communication library.
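
As a minimal sketch of how the timeout variables might be supplied to worker processes, the snippet below builds an environment mapping in Python. The numeric values are illustrative assumptions rather than recommended defaults, and build_worker_env is a hypothetical helper. MS_ENABLE_LCCL is left out because its accepted values are not stated here.

```python
import os

# Illustrative timeouts in seconds; assumptions, not recommended defaults.
DYNAMIC_NETWORKING_TIMEOUTS = {
    "MS_TOPO_TIMEOUT": 1800,
    "MS_NODE_TIMEOUT": 600,
    "MS_RECEIVE_MSG_TIMEOUT": 600,
}


def build_worker_env(base_env, timeouts):
    """Return a copy of base_env with each timeout variable set as a string."""
    env = dict(base_env)
    for name, seconds in timeouts.items():
        if seconds <= 0:
            raise ValueError(f"{name} must be a positive number of seconds")
        env[name] = str(seconds)
    return env


worker_env = build_worker_env(os.environ, DYNAMIC_NETWORKING_TIMEOUTS)
print({name: worker_env[name] for name in DYNAMIC_NETWORKING_TIMEOUTS})
```

The resulting mapping can be passed as the env argument of subprocess.Popen when each worker is started, so that the variables are visible before the cluster is initialized.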

Bug Fixes

  • [#I9CR96] Fixed dynamic networking startup failures in large-scale clusters caused by an insufficient timeout.

  • [#I94AQQ] Fixed the incorrect output shape of the ops.Addcdiv operator in graph mode.

Contributors

Thanks go to these wonderful people:

bantao,caifubi,changzherui,chenfei_mindspore,chenweifeng,dairenjie,dingjinshan,fangzehua,fanyi20,fary86,GuoZhibin,hanhuifeng,haozhang,hedongdong,Henry Shi,huandong1,huangbingjian,huoxinyou,jiangchenglin3,jiangshanfeng,jiaorui,jiaxueyu,jxl,kairui_kou,lichen,limingqi107,liuluobin,LLLRT,looop5,luochao60,luojianing,maning202007,NaCN,niyuxin94520,nomindcarry,shiziyang,tanghuikang,TronZhang,TuDouNi,VectorSL,wang_ziqi,wanghenchang,wudawei,XianglongZeng,xiaoxiongzhu,xiaoyao,yanghaoran,Yanzhi_YI,yao_yf,yide12,YijieChen,YingLai Lin,yuchaojie,YuJianfeng,zangqx,zhanghanLeo,ZhangZGC,zhengzuohe,zhouyaqiang0,zichun_ye,zjun,ZPaC,zyli2020,冯一航,李林杰,刘力力,王禹程,俞涵,张栩浩,朱家兴,邹文祥

Contributions of any kind are welcome!

MindSpore 2.3.0-rc1 Release Notes

Major Features and Improvements

DataSet

  • [STABLE] Support integrity checks and encryption/decryption checks for MindRecord to protect the integrity and security of user data.

  • [STABLE] MindRecord API changes: FileWriter.open_and_set_header is deprecated because it has been integrated into FileWriter; if old code reports an error, delete this call. Type checking is added for data in FileWriter to ensure that the data type defined by the schema matches the actual data type. The return values of all methods under MindRecord are removed; an exception is raised instead when a processing error occurs.

  • [STABLE] Support the Ascend processing backend for the following transforms: ResizedCrop, HorizontalFlip, VerticalFlip, Perspective, Crop, Pad, GaussianBlur, Affine.

  • [STABLE] Improved the data processing part of the model migration guide, providing more examples that compare with third-party frameworks.

  • [STABLE] Optimized the parsing efficiency of TFRecordDataset in multi-column scenarios, improving parsing performance by 20%.
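
The MindRecord changes above might look as follows in user code. This is a sketch under assumptions: write_mindrecord, the schema, and the writer_cls parameter are hypothetical (the parameter exists only so the helper can be exercised with a stand-in class when MindSpore is not installed).

```python
def write_mindrecord(path, rows, writer_cls=None):
    """Write `rows` to a MindRecord file; failures surface as exceptions."""
    if writer_cls is None:
        # Imported lazily so the sketch can be loaded without MindSpore.
        from mindspore.mindrecord import FileWriter as writer_cls

    # The schema types must match the actual row data, which FileWriter checks.
    schema = {"label": {"type": "int32"}, "text": {"type": "string"}}

    writer = writer_cls(file_name=path, shard_num=1, overwrite=True)
    # No writer.open_and_set_header() call: it is deprecated and should be deleted.
    writer.add_schema(schema, "illustrative schema")
    writer.write_raw_data(rows)
    writer.commit()
    return len(rows)
```

Because the methods no longer return status values, a caller would wrap write_mindrecord("demo.mindrecord", [{"label": 0, "text": "a"}]) in try/except instead of checking a return code.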

PIJIT

  • [BETA] PIJit analyzes and adjusts the Python bytecode and performs graph capture and graph optimization on the execution flow. Supported Python code is executed in static graph mode, and unsupported code is split into subgraphs and executed in dynamic graph mode, automatically unifying dynamic and static execution. Users can enable PIJit by decorating a function with @jit(mode="PIJit", jit_config={options:value}).
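
A minimal sketch of the decorator usage described above, written as a wrapper so it can be loaded without MindSpore installed. enable_pijit and scale_and_shift are hypothetical names, and the jit parameter exists only to allow a stand-in decorator; no specific jit_config options are assumed.

```python
def enable_pijit(fn, jit=None, jit_config=None):
    """Wrap `fn` the same way @jit(mode="PIJit", ...) would decorate it."""
    if jit is None:
        # Imported lazily so the sketch can be loaded without MindSpore.
        from mindspore import jit
    kwargs = {"mode": "PIJit"}
    if jit_config is not None:
        kwargs["jit_config"] = jit_config  # options are left to the caller
    return jit(**kwargs)(fn)


def scale_and_shift(x, scale, shift):
    """Plain Python function that PIJit would capture as a graph where possible."""
    return x * scale + shift
```

With MindSpore installed, enable_pijit(scale_and_shift) is equivalent to placing @jit(mode="PIJit") above the function definition.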

Inference

  • [DEMO] Large model inference is upgraded to an integrated training and inference architecture that unifies scripts, distributed strategies, and the runtime. The period from training to inference deployment of typical large models is reduced to days. Large fused operators reduce inference latency and effectively improve network throughput.

AutoParallel

  • [STABLE] Add the msrun startup method to launch a distributed job with a single command.

  • [STABLE] Add a deprecation hint for the RankTable startup method.

  • [STABLE] Eliminate redundant constants in graph mode to improve compilation performance and reduce memory overhead.

  • [STABLE] In subgraph scenarios with optimizer parallelism, the first subgraph is inlined so that some computation and communication overlap can take place under pipeline parallelism.

  • [STABLE] Communication information export: export model communication information (communication domain, communication volume) during compilation, and input it to the cluster as the basis for communication scheduling.

  • [STABLE] Pipeline parallel inference is optimized: forwarding of shared weights between stages is eliminated, improving execution performance. Pipeline inference results are broadcast automatically, improving the usability of autoregressive inference.

  • [STABLE] Operator-level parallel sharding supports the configuration of the mapping between the device layout and tensor layout during MatMul/Add/LayerNorm/GeLU/BiasAdd operator sharding.

  • [STABLE] Support overlapping gradient communication with backward computation in the data parallel dimension.

  • [STABLE] Single-device simulated compilation, used to simulate the compilation process of one device in multi-device distributed training, helps analyze the compilation process and memory usage on the front end and back end.

  • [STABLE] Implement ops.Tril sharding to reduce the memory and performance requirements on a single device.

  • [BETA] Supports the fusion between communication operators and computing operators, in order to overlap communication overheads with computation and improve network performance.

  • [BETA] Load checkpoints and compile graphs in parallel to accelerate fault recovery.
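
For the msrun startup method listed above, the single launch command could be assembled as in the sketch below. The flags shown (--worker_num, --local_worker_num, --master_port, --log_dir, --join) are not part of these notes; they are assumptions based on the msrun documentation and should be checked against msrun --help for the installed version. build_msrun_command, train.py, and the default values are hypothetical.

```python
import shlex


def build_msrun_command(script, worker_num, local_worker_num,
                        master_port=8118, log_dir="msrun_log"):
    """Assemble an msrun invocation as an argument list (flags are assumptions)."""
    return [
        "msrun",
        f"--worker_num={worker_num}",              # total workers in the job
        f"--local_worker_num={local_worker_num}",  # workers started on this node
        f"--master_port={master_port}",
        f"--log_dir={log_dir}",
        "--join=True",                             # wait for the workers to exit
        script,
    ]


cmd = build_msrun_command("train.py", worker_num=8, local_worker_num=8)
print(shlex.join(cmd))
```

The list form can be handed to subprocess.run(cmd, check=True) without shell quoting concerns.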

Runtime

  • [BETA] Support O0/O1/O2 multi-level compilation to improve static graph debugging and tuning capabilities.

FrontEnd

  • [STABLE] The framework supports the bfloat16 data type. dtype=mindspore.bfloat16 can be specified when a tensor is created.

  • [STABLE] The syntax support of the rewrite component is improved; syntax such as class variables, functions, and control flows can be parsed.

  • [STABLE] New context setting: debug_level. Users can call mindspore.set_context(debug_level=mindspore.DEBUG) to get more debug information.
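
The two FrontEnd calls named above, bfloat16 tensor creation and the debug_level context setting, can be sketched as small helpers. The helper names and the ms parameter are hypothetical; the parameter only allows a stand-in module to be passed when MindSpore is not installed.

```python
def make_bfloat16_tensor(values, ms=None):
    """Create a tensor with dtype=mindspore.bfloat16."""
    if ms is None:
        # Imported lazily so the sketch can be loaded without MindSpore.
        import mindspore as ms
    return ms.Tensor(values, dtype=ms.bfloat16)


def enable_debug_info(ms=None):
    """Request more debug information through the debug_level context setting."""
    if ms is None:
        import mindspore as ms
    ms.set_context(debug_level=ms.DEBUG)
```

With MindSpore installed, make_bfloat16_tensor([1.0, 2.0]) and enable_debug_info() call the real APIs directly.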

Profiler

  • [BETA] Dynamically start and stop profiling. Users can collect profiling data in real time according to the training situation, reducing the amount of data collected.

  • [BETA] Profile the time-consumption matrix of communication operators. Users can find cluster communication performance bottlenecks by analyzing this matrix.

  • [BETA] Improve the performance of parsing profiling data in the Ascend environment.

  • [BETA] Supports offline analysis of data generated by Profiling. Users can collect data first and then parse the data as needed.

  • [BETA] Supports collecting performance data of On-Chip Memory, PCIe, and l2_cache to enrich performance analysis indicators.

Dump

  • [BETA] The statistical information saved by Dump records MD5 values, so users can detect small differences in tensor values by comparing MD5 values.

  • [BETA] Dump supports the float16 data type, helping users locate accuracy issues in float16 operators.
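
Why an MD5 value helps alongside summary statistics can be shown with a small standalone sketch. It does not use Dump itself: the byte layout hashed here (little-endian float32) and the helper names are assumptions made for illustration.

```python
import hashlib
import struct


def tensor_md5(values):
    """MD5 of the values packed as little-endian float32 bytes."""
    payload = struct.pack(f"<{len(values)}f", *values)
    return hashlib.md5(payload).hexdigest()


def mean_text(values):
    """Mean formatted to six decimals, standing in for a coarse statistic."""
    return f"{sum(values) / len(values):.6f}"


reference = [1.0, 2.0, 3.0]
candidate = [1.0, 2.0, 3.0000005]  # differs only slightly in the last element

print("means equal:", mean_text(reference) == mean_text(candidate))
print("md5 equal:  ", tensor_md5(reference) == tensor_md5(candidate))
```

The rounded means are identical, while the digests differ, which is the kind of small difference an MD5 comparison exposes.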

PyNative

  • [STABLE] Refactor the single-operator calling process for dynamic graphs to improve dynamic graph performance.

Ascend

  • [BETA] Support setting CANN configuration options, which are divided into two categories: global and session. Users can configure them through mindspore.set_context(ascend_config={"ge_options": {"global": {"global_option": "option_value"}, "session": {"session_option": "option_value"}}}).

API Change

  • Add mindspore.hal API to support stream, event, and device management capabilities.

  • Add mindspore.multiprocessing API to provide the capability of creating multiple processes.

Operators

  • [BETA] mindspore.ops.TopK now supports the second input k as an int32 type tensor.

Bug Fixes

  • [#I92H93] Fixed the issue of ‘Launch kernel failed’ when using the Print operator to print string objects on the Ascend platform.

  • [#I8S6LY] Fixed RuntimeError: Attribute dyn_input_sizes of Default/AddN-op1 is [const vector]{}, of which size is less than 0 error of variable-length input operator, such as AddN or Concat, for dynamic shape process in graph mode on the Ascend platform.

  • [#I9ADZS] Fixed the data timeout issue in network training due to inefficient dataset recovery in the fault recovery scenario.

Contributors

Thanks go to these wonderful people:

AlanCheng511,AlanCheng712,bantao,Bingliang,BJ-WANG,Bokai Li,Brian-K,caifubi,cao1zhg,CaoWenbin,ccsszz,chaiyouheng,changzherui,chenfei_mindspore,chengbin,chengfeng27,chengxb7532,chenjianping,chenkang,chenweifeng,Chong,chuht,chujinjin,Cynthia叶,dairenjie,DavidFFFan,DeshiChen,douzhixing,emmmmtang,Erpim,fangzhou0329,fary86,fengxun,fengyixing,fuhouyu,gaoshuanglong,gaoyong10,GaoZhenlong,gengdongjie,gent1e,Greatpan,GTT,guoqi,guoxiaokang1,GuoZhibin,guozhijian,hangq,hanhuifeng,haozhang,hedongdong,hejianheng,Henry Shi,heyingjiao,HighCloud,Hongxing,huandong1,huangbingjian,HuangLe02,huangxinjing,huangziling,hujiahui8,huoxinyou,jiangchenglin3,jianghui58,jiangshanfeng,jiaorui,jiaxueyu,JichenZhao,jijiarong,jjfeing,JoeyLin,JuiceZ,jxl,kairui_kou,kate,KevinYi,kisnwang,lanzhineng,liangchenghui,LiangZhibo,lianliguang,lichen,ligan,lihao,limingqi107,ling,linqingke,liruyu,liubuyu,liuchao,liuchengji,liujunzhu,liuluobin,liutongtong9,liuzhuoran2333,liyan2022,liyejun,LLLRT,looop5,luochao60,luojianing,luoyang,LV,machenggui,maning202007,Margaret_wangrui,MaZhiming,mengyuanli,MooYeh,moran,Mrtutu,NaCN,nomindcarry,panshaowu,panzhihui,PingqiLi,qinzheng,qiuzhongya,Rice,shaojunsong,Shawny,shenwei41,shenyaxin,shunyuanhan,silver,Songyuanwei,tangdezhi_123,tanghuikang,tan-wei-cheng,TingWang,TronZhang,TuDouNi,VectorSL,WANG Cong,wang_ziqi,wanghenchang,wangpingan,wangshaocong,wangtongyu6,weiyang,WinXPQAQ,wtcheng,wudawei,wujiangming,wujueying,wuweikang,wwwbby,XianglongZeng,xiaosh,xiaotianci,xiaoxin_zhang,xiaoxiongzhu,xiaoyao,XinDu,xingzhongfan,yanghaoran,yangluhang,yangruoqi713,yangzhenzhang,yangzishuo,yanjiaming,Yanzhi_YI,yao_yf,yefeng,yeyunpeng2020,yide12,YijieChen,YingLai Lin,YingtongHu,youshu,yuchaojie,YuJianfeng,zangqx,zby,zhaiyukun,zhangdanyang,zhanghaibo,zhanghanLeo,zhangminli,zhangqinghua,zhangyanhui,zhangyifan,zhangyinxia,zhangyongxian,ZhangZGC,zhanzhan,zhaoting,zhengyafei,zhengzuohe,ZhihaoLi,zhouyaqiang0,zhuguodong,zhumingming,zhupuxu,zichun_ye,zjun,zlq2020,ZPaC,zuochuanyong,zyli2020,陈宇,代宇鑫,狄新凯,范吉斌,冯一航,胡彬,宦晓玲,黄勇,康伟,李良灿,李林杰,刘崇鸣,刘力力,刘勇琪,吕浩宇,没有窗户的小巷,王禹程,吴蕴溥,熊攀,徐安越,徐永飞,许哲纶,俞涵,张峻源,张树仁,张王泽,张栩浩,郑裔,周莉莉,周先琪,朱家兴,邹文祥

Contributions of any kind are welcome!