Release Notes

MindSpore 2.2.14 Release Notes

Major Features and Improvements

Parallel

  • [STABLE] Changed the communication group of the send/receive operators in pipeline parallelism to the world group, avoiding the creation of redundant communication groups and reducing the memory required for communication.

  • [STABLE] Optimized the compilation cache to reduce graph conversion when loading a cached graph, improving compilation cache performance.

  • [BETA] Pipeline parallelism supports Interleave, optimizing performance when the micro batch size is small.

  • [BETA] Optimized checkpoint transformation speed when using pipeline parallelism; single-stage transformation is supported.

Profiler

  • [BETA] Profiling can be started and stopped dynamically, so users can collect profiling data in real time for just the training phases of interest, reducing the amount of data collected (see the sketch after this list).

  • [BETA] Profiling can collect an elapsed-time matrix for communication operators; users can analyze this matrix to locate cluster communication performance bottlenecks.
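
A minimal sketch of dynamic start/stop profiling under assumed conditions; the step thresholds, output path, and the train_one_step stand-in are illustrative, not part of the MindSpore API:

```python
import mindspore as ms
from mindspore import Profiler

def train_one_step():
    # Stand-in for the real training step.
    _ = ms.ops.matmul(ms.Tensor([[1.0, 2.0]]), ms.Tensor([[3.0], [4.0]]))

# Create the profiler without starting it, then collect data only for the steps of interest.
profiler = Profiler(start_profile=False, output_path="./profiler_data")
for step in range(30):
    if step == 10:
        profiler.start()   # begin collecting profiling data mid-training
    train_one_step()
    if step == 20:
        profiler.stop()    # stop collecting to limit the data volume
profiler.analyse()         # parse and save the collected performance data
```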

Dump

  • [BETA] The statistics saved by Dump now include MD5 values, which users can compare to detect small differences in tensor values.

  • [BETA] Dump supports the float16 data type, helping users locate accuracy issues in float16 operators.

Bug Fixes

  • [#I962EV] Fixed an issue in CPU and GPU environments with cond inputs of 4, 5, 6, 7, and 8 dimensions.

  • [#I96E5R] Fixed an issue in PyNative mode where the input of the Mul operator is in NCHW format on the Ascend platform.

  • [#I96I5D] Fixed an issue of incorrect input types when computing Scalar types in dynamic shape scenarios.

  • [#I99QAB] Fixed an issue where asnumpy could not correctly identify bfloat16 tensors in some scenarios.

  • [#I9ADZS] Fixed a data timeout issue in network training caused by inefficient dataset recovery in fault recovery scenarios.

  • [#I8Y9JT] Fixed an issue where some networks failed to converge because of an incorrect optimizer execution sequence in specific scenarios where the nn.SGD optimizer has a large loss_scale and a small weight_decay.

Contributors

Thanks goes to these wonderful people:

fary86, wanghenchang, haozhang, mengyuanli, emmmmtang, luoyang, zhupuxu, zhangyongxian, liuluobin, LLLRT, TuDouNi, hujiahui8, wangtongyu6, ligan, zhuguodong, yanghaoran, YingtongHu, liyejun, zjun, 徐永飞, chuht, 张树仁, 徐安越, DeshiChen, shenyaxin, liujunzhu, shunyuanhan, yuchaojie, yao_yf, 没有窗户的小巷, yeyunpeng2020, weiyang, KevinYi, hedongdong, zhouyaqiang0, Margaret_wangrui, zhanghaibo, moran, huangziling, 朱家兴, GuoZhibin, 李良灿, jiaxueyu, gaoyong10, Greatpan, 宦晓玲, melody, 俞涵, jiangshanfeng, XinDu, ling, caifubi, zhangyinxia, gengdongjie, Erpim, XianglongZeng, zhangminli, fengyixing, 冯一航, 黄勇, panzhihui, 胡彬, linqingke, wangshaocong

Contributions of any kind are welcome!

MindSpore 2.2.13 Release Notes

API Change

Added timeout-related environment variables for dynamic networking scenarios (a usage sketch follows the list):

  • MS_TOPO_TIMEOUT: Cluster networking phase timeout, in seconds.

  • MS_CLUSTER_RETRY_NUM: Number of registration retries per node during the cluster networking phase.

  • MS_NODE_TIMEOUT: Node heartbeat timeout in seconds.

  • MS_RECEIVE_MSG_TIMEOUT: Node timeout for receiving messages in seconds.
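
A rough illustration of setting these variables (the values are placeholders, not recommended defaults); they must be set in each process's environment before the cluster is launched:

```python
import os

os.environ["MS_TOPO_TIMEOUT"] = "600"         # cluster networking phase timeout, in seconds
os.environ["MS_CLUSTER_RETRY_NUM"] = "10"     # registration retries per node during networking
os.environ["MS_NODE_TIMEOUT"] = "300"         # node heartbeat timeout, in seconds
os.environ["MS_RECEIVE_MSG_TIMEOUT"] = "300"  # message-receiving timeout, in seconds
```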

Bug Fixes

  • [#I9CR96] Fixed an issue where an insufficient timeout caused dynamic networking startup to fail in large-scale clusters.

Contributors

Thanks goes to these wonderful people:

ZPaC, limingqi107, lizhenyu, jiangshanfeng

Contributions of any kind are welcome!

MindSpore 2.2.12 Release Notes

Major Features and Improvements

  • [STABLE] Optimized scenarios where network parameters are initialized as fp32 and optimizer parallelism is enabled, reducing the number of Cast operators.

  • [STABLE] Added detection and handling capabilities for silent data corruption. Silent data corruption can cause errors during training; this feature helps users prevent such faults or lower the cost of locating them.

Bug Fixes

  • [#I97D1L] Fixed sample errors in the ReduceLROnPlateau, LRScheduler, and CosineAnnealingWarmRestarts dynamic learning rate interfaces.

  • [#I970HV] Fixed a problem where the order of AllGather/ReduceScatter operations between two cards was not preserved.

  • [#I99JPI] Fixed checkpoint loading for bfloat16 parameters in vague load mode.

Contributors

Thanks goes to these wonderful people:

yao_yf, YijieChen, 冯一航, yuchaojie, 李良灿, YuJianfeng, huangxinjing, GuoZhibin, looop5

Contributions of any kind are welcome!

MindSpore 2.2.11 Release Notes

Major Features and Improvements

scipy

  • [STABLE] Added the new API mindspore.scipy.optimize.linear_sum_assignment to the scipy module for solving the linear sum assignment problem; it finds the least-cost assignment for a given cost matrix (a usage sketch follows).
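
A minimal sketch mirroring SciPy's interface; the exact argument list of the MindSpore version is an assumption here, so check the r2.2 API documentation before relying on it:

```python
import mindspore as ms
from mindspore.scipy.optimize import linear_sum_assignment

# cost[i][j] is the cost of assigning worker i to task j.
cost = ms.Tensor([[4.0, 1.0, 3.0],
                  [2.0, 0.0, 5.0],
                  [3.0, 2.0, 2.0]])
row_ind, col_ind = linear_sum_assignment(cost)
print(row_ind, col_ind)  # indices of the least-cost worker-to-task assignment
```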

Bug Fixes

  • [#I8JVRU] Fixed a problem where two runs of the bernoulli random operator on the GPU could probabilistically produce identical results.

  • [#I8OC32] Fixed a segmentation fault caused by the MatrixSetDiagV3 operator not validating abnormal input.

Contributors

Thanks goes to these wonderful people:

fary86, wanghenchang, haozhang, mengyuanli, emmmmtang, luoyang, zhupuxu, zhangyongxian, liuluobin, LLLRT, TuDouNi, hujiahui8, wangtongyu6, ligan, zhuguodong, yanghaoran, YingtongHu, liyejun, zjun, 徐永飞, chuht, 张树仁, 徐安越, DeshiChen, shenyaxin, liujunzhu, shunyuanhan, yuchaojie, yao_yf, 没有窗户的小巷, yeyunpeng2020, weiyang, KevinYi, hedongdong, zhouyaqiang0, Margaret_wangrui, zhanghaibo, moran, huangziling, 朱家兴, GuoZhibin, 李良灿, jiaxueyu, gaoyong10, Greatpan, 宦晓玲, melody, 俞涵, jiangshanfeng, XinDu, ling, caifubi, zhangyinxia, gengdongjie, Erpim, XianglongZeng, zhangminli, fengyixing, 冯一航, 黄勇, panzhihui, 胡彬, linqingke, wangshaocong

Contributions of any kind are welcome!

MindSpore 2.2.10 Release Notes

Major Features and Improvements

Operators

  • [STABLE] FastGelu, BatchMatMul, AllReduce, AllGather, Broadcast, and ReduceScatter support the bfloat16 data type (see the sketch after this list).

  • [STABLE] AllGather supports the uint8 data type.
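
A minimal single-device sketch of the bfloat16 support, assuming an Ascend back end with bfloat16 enabled; the collective operators listed above additionally require a multi-device communication setup:

```python
import numpy as np
import mindspore as ms
from mindspore import Tensor, ops

x = Tensor(np.random.randn(2, 3, 4), dtype=ms.bfloat16)
y = Tensor(np.random.randn(2, 4, 5), dtype=ms.bfloat16)
out = ops.BatchMatMul()(x, y)   # batched matrix multiply in bfloat16
print(out.shape, out.dtype)     # (2, 3, 5), BFloat16
```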

Bug Fixes

  • [#I8ALW3] Fixed errors in networks including Faster R-CNN, DeepText, and MaskRCNN-ResNet50 when training with the RandomChoiceWithMask operator in the Ascend 910 8P scenario.

  • [#I8LKG7] Fixed a graph compilation error of UNet-2D in the Ascend 910 1P/8P scenario.

  • [#I8KU3X] Fixed the CRNN-ResNet34 network getting stuck in the training phase in Ascend 910 1P/8P PyNative mode.

  • [#I8KTHH] Fixed a BERT network error when training without grouped AllReduce fusion while enable_parallel_optimizer=True in the Ascend 910 8P scenario.

Contributors

Thanks goes to these wonderful people:

李林杰, TuDouNi, chengxb7532, Henry Shi, rms-infer-type, 朱家兴, zhouyaqiang0, tanghuikang, gaoyong10, gengdongjie, yao_yf, hujiahui8, hanhuifeng, shenyaxin, KevinYi, 冯一航, chengfeng27, JuiceZ, zhangyanhui, jijiarong, xiaoxiongzhu, 没有窗户的小巷, ling, liyan2022, haozhang, zangqx, xiaoyao, liujunzhu, 胡彬, panzhihui, wangshaocong, linqingke, jianghui58, qiuzhongya, yangruoqi713, zhangminli, moran, 王禹程, shaojunsong, wangtongyu6, zhupuxu, luoyang, 徐安越, qinzheng, caifubi, 徐永飞, chenkang, youshu, XinDu, liubuyu, jxl, yeyunpeng2020, huoxinyou, yefeng, jiaorui, wangpingan, cao1zhg, zjun, zyli2020, yanjiaming, Cynthia叶, 胡安东, 李良灿, liruyu, liuluobin, lihao, huangbingjian, YijieChen, jjfeing, looop5, 刘力力, xiaoxin_zhang, yangluhang, chenweifeng, jiangshanfeng, zichun_ye, 陈宇, NaCN, ligan, YingLai Lin, huangziling, chenjianping, DeshiChen, chengbin, kairui_kou, ccsszz, yanghaoran, zhangdanyang, Yanzhi_YI, zhengzuohe, hangq, TronZhang, wanghenchang, HighCloud, 吕浩宇, VectorSL, ZPaC, mengyuanli, maning202007, 刘勇琪, r1chardf1d0, fary86, 刘崇鸣, yuchaojie, douzhixing, fengyixing

Contributions of any kind are welcome!

MindSpore 2.2.1 Release Notes

Bug Fixes

  • [#I7R3R5] Fixed a problem where the precision of the ResNet-50 network deteriorated on the Ascend platform.

  • [#I8A9RH] Fixed an issue where the precision of the DBNet (ResNet-50) network deteriorated on the Ascend platform.

  • [#I8B8IW] Fixed a segmentation fault caused by out-of-bounds multi-dimensional tensor assignment.

  • [#I8J0F4] Fixed an issue where expanding the dimensions of a multidimensional Tensor failed to execute in dynamic graph mode.

  • [#I87P3P] Fixed an issue where the compilation cache failed to load during secondary training on the Ascend platform.

  • [#I86GP9] Fixed an issue where the UNet3D network inference precision deteriorated on the Ascend platform.

  • [#I89B4K] Fixed an issue where dynamic rank execution in dynamic graph mode hung on the Windows platform.

  • [#I8CX0C] Fixed an issue where dynamic graph execution occasionally failed in mixed precision mode on the Ascend platform.

  • [#I8BGCF] Fixed an issue where a segmentation fault occurred when running the AirNet network in dynamic graph mode on the Ascend platform.

  • [#I8L5DS] Fixed an issue where the ResNet-50 image segmentation network executed slowly in dynamic graph mode on the Ascend platform.

Contributors

Thanks goes to these wonderful people:

yufan, dingcheng, lvzhangcheng, zhunaipan, fangwenyi, weiyang, changzherui, chujinjin, zangqingxiang, yuchaojie, wuweikang, tanghuikang, xiaoyao, huangbinjian, zhoupeichen, chenfei_mindspore, hedongdong, wangnan, zhengzuohe, yanghaoran, zouliqin, luoyang, liuchongmin, lujiale, machenggui, wangcong, lixiangyi, wangting, huangyong

Contributions of any kind are welcome!

MindSpore 2.2.0 Release Notes

Major Features and Improvements

DataSet

  • [STABLE] The row_size parameter of the map/batch data operations is extended to accept a list representing [Input Shared Memory, Output Shared Memory], so that the shared memory size can be controlled flexibly in multi-process mode.

  • [STABLE] Provided reference samples for 100% of the mindspore.dataset and mindspore.dataset.transforms APIs.

  • [STABLE] ConcatDataset supports global sampling: after data from multiple sources is combined with the concat operation, it can be globally and randomly sampled to enhance data diversity.

  • [STABLE] When the model.train API is used for training, TimeMonitor(.., data_time=True) can be used to monitor data processing performance in real time (see the sketch after this list).

  • [STABLE] Introduced the jemalloc library to solve slowly rising memory usage caused by untimely reclamation of memory fragments in extreme scenarios.
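
A minimal sketch of the TimeMonitor usage mentioned above; the toy network and dataset are illustrative stand-ins:

```python
import numpy as np
import mindspore.dataset as ds
from mindspore import nn
from mindspore.train import Model, TimeMonitor

data = {"x": np.random.randn(64, 8).astype(np.float32),
        "y": np.random.randn(64, 1).astype(np.float32)}
dataset = ds.NumpySlicesDataset(data, shuffle=False).batch(8)

net = nn.Dense(8, 1)
model = Model(net, loss_fn=nn.MSELoss(), optimizer=nn.Adam(net.trainable_params()))
# data_time=True reports per-step data-processing time alongside the step time.
model.train(1, dataset, callbacks=[TimeMonitor(data_time=True)])
```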

FrontEnd

  • [STABLE] Support the @lazy_inline decorator, which makes the graph generated from a cell be inlined lazily and can effectively improve compilation performance (see the sketch after this list).

  • [STABLE] Optimized mixed precision training: Python scripts can be rewritten automatically through rewrite to apply mixed precision strategies, with automatic parsing of functions, branch statements, and other syntax.

  • [STABLE] Optimized mixed precision functionality: ReWrite supports syntax parsing of class functions and branch statements, and extends the O1 functionality.

  • [STABLE] Optimized the dynamic learning rate feature and added APIs such as MultiStepLR; decoupled the get_lr function from global_step, extending the optimizer module's functionality.

  • [STABLE] Optimize API code samples, API difference tables, and tutorials for using higher-order functions.
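
A minimal sketch of the @lazy_inline usage mentioned above (the layer count and sizes are illustrative); decorating a Cell's __init__ keeps the cell as a reusable subgraph that is inlined lazily, which shortens compilation when the same block is stacked many times:

```python
import numpy as np
import mindspore as ms
from mindspore import nn, lazy_inline

class Block(nn.Cell):
    @lazy_inline
    def __init__(self, hidden):
        super().__init__()
        self.dense = nn.Dense(hidden, hidden)
        self.act = nn.ReLU()

    def construct(self, x):
        return self.act(self.dense(x))

class Net(nn.Cell):
    def __init__(self, hidden, num_layers):
        super().__init__()
        self.blocks = nn.SequentialCell([Block(hidden) for _ in range(num_layers)])

    def construct(self, x):
        return self.blocks(x)

net = Net(hidden=128, num_layers=24)
out = net(ms.Tensor(np.ones((2, 128), np.float32)))
```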

Operator

  • [STABLE] Add new operator primitive mindspore.ops.Dense.

  • [STABLE] Added the random number operator state management feature, which allows random number operators to save their state and be reproduced stably in scenarios such as model parallelism and recomputation. Currently only the CPU/GPU platforms are supported, and the operators involved include: mindspore.ops.Multinomial, mindspore.ops.MultinomialWithReplacement, mindspore.ops.ParameterizedTruncatedNormal, mindspore.ops.StandardLaplace, mindspore.ops.Uniform, mindspore.ops.UniformInt, mindspore.ops.UniformReal, mindspore.ops.Dropout, mindspore.ops.RandomChoiceWithMask, mindspore.ops.RandomCategorical, mindspore.ops.RandomShuffle, mindspore.ops.RandomGamma, mindspore.ops.RandomPoisson, and mindspore.ops.TruncatedNormal.

  • [STABLE] When a GPU operator encounters illegal input, it can asynchronously print error logs from the operator's CUDA kernel to the host side and interrupt execution of the current CUDA stream, making it more efficient for users to locate operator problems.

PyNative

  • [STABLE] Support the view mechanism in PyNative mode.

  • [STABLE] Enhanced functionality in PyNative mode: sens supports the dict input type.

Ascend

  • [STABLE] Supports a user-configurable high-precision/high-performance mode for operators; users can use context.set_context(ascend_config={"op_precision_mode": "/path/to/op_precision_config_file"}) to configure high-precision/high-performance modes for some TBE operators (see the sketch after this list).

  • [BETA] Supports user-configurable fp16-in/fp32-out operators; users can use context.set_context(ascend_config={"precision_mode": "force_fp32"}) to configure fp16-in and fp32-out for the TBE Cube operators.

  • [BETA] Removed the strong binding between jit_level="O3" and the GE process, so users no longer need to set jit_level="O3" when executing the GE process.
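
A configuration sketch combining the two ascend_config options above; the config-file path is the placeholder from the note, and an Ascend back end is assumed:

```python
from mindspore import context

context.set_context(device_target="Ascend")
context.set_context(ascend_config={
    # High-precision/high-performance mode for selected TBE operators.
    "op_precision_mode": "/path/to/op_precision_config_file",
    # fp16-in/fp32-out for the TBE Cube operators (BETA).
    "precision_mode": "force_fp32",
})
```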

Parallel

  • [STABLE] Support the gradient accumulation feature in non-pipeline parallel scenarios in semi-automatic/fully automatic mode. Users can enable gradient accumulation by writing net = GradAccumulationCell(net, micro_size). The gradient accumulation feature is compatible with the lazy_inline feature.
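
A minimal sketch of the usage quoted above; the import location of GradAccumulationCell is an assumption (check the r2.2 parallel documentation), and the stand-in network and micro_size value are illustrative:

```python
from mindspore import nn
from mindspore.nn.wrap.cell_wrapper import GradAccumulationCell  # assumed import path

net = nn.Dense(16, 16)                         # stand-in for the real network
# Accumulate gradients over 4 micro-batches before each optimizer update.
net = GradAccumulationCell(net, micro_size=4)
```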

Inference

Since version 2.2, the MindSpore main release package no longer provides the inference interface for the Ascend 310. If you need the inference interface, install the MindSpore Lite release package or download a MindSpore version earlier than 2.0. For details about how to install and use MindSpore Lite, see https://www.mindspore.cn/lite/en. HUAWEI Ascend 310 is an energy-efficient, highly integrated AI processor for edge scenarios that supports inference on MindIR models. Earlier versions of MindSpore provided two methods for enabling inference on Ascend 310 hardware:

  1. The MindSpore main release package provides the matching Ascend 310 version that supports C++ inference interfaces.

  2. The MindSpore Lite release package provides the matching Ascend version and supports C++ and Java inference.

The C++ APIs provided by the two solutions are basically the same. Going forward, MindSpore Lite will be used instead of building and maintaining two sets of interfaces. Original Ascend 310 inference services built on the MindSpore main release package can be switched to MindSpore Lite with a few modifications. For details, see https://www.mindspore.cn/docs/en/r2.2/faq/inference.html.

Bug Fixes

  • [I7SDA0] Fixed an issue where the accuracy of the CRNN network deteriorated on the NES platform.

  • [I7T4QK] Fixed an issue where the inference precision of the WGAN network deteriorated on the OptiX OSN 8800 platform.

  • [I7TJ8Z] Fixed an issue where the inference precision of the LGTM network deteriorated on the OptiX OSN 8800 platform.

  • [I7M58O] Fixed a core dump issue when training the ASR-dynamic network on the Ascend platform.

  • [I7L6B6] Fixed an issue where child processes do not exit in some scenarios when dataset is in multi-process mode.

  • [I7L7AE] Fixed an issue where batchinfo.get_epoch_num() was used incorrectly in the dynamic scenario of dataset.batch when the dataset pipeline contains repeat operations.

  • [I7UY7G] Fixed the file permission modification error in OBSMindDataset.

Contributors

Thanks goes to these wonderful people: bantao, Bingliang, BJ-WANG, Brian-K, caifubi, ccsszz, changzherui, chenfei_mindspore, chengfeng27, chenhaozhe, chenjianping, chenkang, chenweifeng, chuht, chujinjin, CShu0507, Cynthia叶, DeshiChen, douzhixing, Erpim, Etienne, fary86, fengxun, fengyixing, gaoshuanglong, Gaoxiong, gaoyong10, GaoZhenlong, Greatpan, GuoZhibin, guozhijian, hangq, hanhuifeng, haozhang, hedongdong, Henry Shi, HighCloud, Hongxing, huangbingjian, huanghui, huangxinjing, huangziling, hujiahui8, huoxinyou, HWalkingMan, jianghui58, jiangshanfeng, jiaorui, jijiarong, jjfeing, JuiceZ, jxl, KevinYi, kisnwang, KXiong, lanzhineng, Li Qingguo, LiangZhibo, lianliguang, ligan, lihao, Lihoon, limingqi107, ling, linqingke, liruyu, liubuyu, liuchao, liujunzhu, liuluobin, liupeng303, liutongtong9, liyan2022, liyejun, looop5, luochao60, luojianing, luoyang, machenggui, maning202007, Margaret_wangrui, MaZhiming, mengyuanli, moran, NaCN, nomindcarry, panshaowu, panzhihui, qinzheng, qiuzhongya, r1chardf1d0, shaojunsong, shenwei41, shenyaxin, shenzhangyi, Shira Zaloshinski, shunyuanhan, tangdezhi_123, tanghuikang, tan-wei-cheng, tan-wei-cheng-3260, TronZhang, TuDouNi, VectorSL, wang_ziqi, wanghenchang, wangpingan, wangshaocong, wangtongyu6, wtcheng, wujueying, XianglongZeng, xiaotianci, xiaoxin_zhang, xiaoxiongzhu, xiaoyao, xiaoyuanyuan, XinDu, xujinliang, xupan, yanghaoran, yangluhang, yangruoqi713, yangsijia, yangzhenzhang, yangzishuo, yanjiaming, Yanzhi_YI, yao_yf, yefeng, yeyunpeng2020, yide12, YijieChen, YingLai Lin, YingtongHu, yonibaehr, youshu, yuchaojie, YuJianfeng, zangqx, zhaizhiqiang, zhangbuxue, zhangchunlei, zhangdanyang, zhangdong, zhanghaibo, zhangminli, zhangqi, zhangqinghua, zhangyanhui, zhangyifan, zhangyongxian, zhangzhen, zhangzheng, zhanzhan, zhengzuohe, ZhihaoLi, zhoufeng, zhouyaqiang0, zhuguodong, zhupuxu, zichun_ye, zjun, ZPaC, zuochuanyong, zyli2020, 陈宇, 程超, 范吉斌, 冯浩, 冯一航, 胡彬, 宦晓玲, 黄勇, 雷元哲, 黎冠新, 李良灿, 李林杰, 刘崇鸣, 刘力力, 刘思铭, 刘勇琪, 吕浩宇, 没有窗户的小巷, 沈竞兴, 王禹程, 王振邦, 徐安越, 徐永飞, 俞涵, 张澍坤, 周超, 朱家兴

Contributions of any kind are welcome!