Document card

Method of setting artificial intelligence execution model and artificial intelligence execution acceleration system for artificial intelligence execution acceleration

ID US0011257008B2_20220222
Country US Number 0011257008 Kind B2 Date 2022.02.22

Basic information

Publication country
US
Document number
0011257008
Document kind
B2
Publication date
2022.02.22
Application number
17291753
Filing date
2018.11.13
Priority application number
No data
Priority date
No data
Priority country
No data

Classification

IPC

  • G06N20/10
    Section G
    Class 06
    Subclass N
  • G06F9/445
    Section G
    Class 06
    Subclass F

CPC

  • G06N20/10
    Section G
    Class 06
    Subclass N
  • G06F9/44505
    Section G
    Class 06
    Subclass F

Service information

Dataset
us
Index
may22_us

Parties

Applicants

  • Soynet Co., Ltd.

Inventors

  • Jung Woo Park
  • Dong Won Eom
  • Yong Ho Kim

Patent holders

  • Soynet Co., Ltd.

Abstract

[0000]
An artificial intelligence execution acceleration system and a method of setting an artificial intelligence execution model are provided. The system includes: an execution weight extraction module for analyzing a learning model that includes an artificial intelligence model and a weight file generated as a result of artificial intelligence learning, and calculating a data weight of the learning model of artificial intelligence for artificial intelligence execution acceleration; an artificial intelligence accelerated execution file setting module for loading the learning model from an artificial intelligence learning server that calculates the learning model, converting the loaded learning model into a custom layer usable in the artificial intelligence execution acceleration system, and then optimizing the custom layer to calculate an execution model; and an artificial intelligence execution acceleration module for receiving the execution model, configuring an execution environment corresponding to the execution model, and accelerating execution speed of artificial intelligence.

Claims

1. An artificial intelligence execution acceleration system comprising:

an execution weight extraction module for calculating a data weight of a learning model of artificial intelligence for artificial intelligence execution acceleration in the learning model that includes an artificial intelligence model and a weight file generated as a result of artificial intelligence learning;

an artificial intelligence accelerated execution file setting module for loading the learning model from an artificial intelligence learning server that calculates the learning model, converting the loaded learning model into a custom layer configured for use in the artificial intelligence execution acceleration system, and then optimizing the custom layer through a process of adjusting an operation function and a module position and modifying an operation method to calculate an execution model;

an optimization module for performing an optimization process of the custom layer with an optimization operation function configured to perform a ReLU operation after a Concat operation so as to perform ReLU operations as a single operation, generating the execution model when optimization of the custom layer is completed, and then applying a weight value of the execution model received from the execution weight extraction module to the execution model; and

an execution acceleration module for receiving the execution model, configuring an execution environment corresponding to the execution model, and accelerating execution speed of artificial intelligence,

wherein the execution acceleration module calculates an optimum value of an allocated amount of a memory required for each execution step of the execution model, checks completion of each execution step including parallel processing, reuses a memory area from which data not reused in a completed step was deleted, transforms data processing between a CPU and a GPU, and processes the execution model of artificial intelligence inside the GPU in an async mode to minimize an occurrence of overhead,

the artificial intelligence accelerated execution file setting module generates the custom layer by visualizing metadata of the learning model by using operation functions including convolution and ReLU in the learning model of artificial intelligence, sets a visualized learning model file as the custom layer by using model setting functions including RPN (region proposal network), NMS (non-maximum suppression), and pooling,

the artificial intelligence accelerated execution file setting module adjusts the module position of the custom layer by combining the ReLU operations into the single operation and notifying that parallelism is present, and modifies a pooling operation method of the custom layer by an Even selection operation that selects only even terms from a matrix and an Odd selection operation that extracts odd terms from the matrix,

the Even selection operation is performed by calculating AvgPool 1×1/2 (where, kernel size=1×1, stride size=2×2) according to an average pooling method, and

the Odd selection operation is performed by calculating AvgPool 1×1/2 after cropping and padding according to a crop method, a padding method, and the average pooling method.
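The Even and Odd selection operations recited in claim 1 can be illustrated with a small sketch. It rests on assumptions the claim does not spell out: positions are counted from zero, so "even terms" are rows and columns 0, 2, 4, ...; the crop drops the first row and column; and the padding appends one row and one column of zeros at the far edges so the shape is preserved. The function names are hypothetical and plain Python lists stand in for tensors.

```python
def avg_pool_1x1_stride2(matrix):
    """AvgPool with kernel 1x1 and stride 2x2.

    A 1x1 window holds a single element, so its average is the element
    itself; the stride of 2 keeps every second row and column.
    """
    return [row[::2] for row in matrix[::2]]


def crop_and_pad(matrix, pad_value=0.0):
    """Assumed shift: drop the first row and column, pad the far edges."""
    width = len(matrix[0])
    shifted = [row[1:] + [pad_value] for row in matrix[1:]]
    shifted.append([pad_value] * width)
    return shifted


def even_select(matrix):
    """Even selection: AvgPool 1x1/2 applied directly (indices 0, 2, ...)."""
    return avg_pool_1x1_stride2(matrix)


def odd_select(matrix):
    """Odd selection: crop and pad first, then AvgPool 1x1/2 (indices 1, 3, ...)."""
    return avg_pool_1x1_stride2(crop_and_pad(matrix))


if __name__ == "__main__":
    m = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    print(even_select(m))
    print(odd_select(m))
```

For a 4×4 input holding the values 0 to 15 row by row, this sketch keeps the elements at positions (0,0), (0,2), (2,0), (2,2) for the Even selection and (1,1), (1,3), (3,1), (3,3) for the Odd selection. With odd-sized inputs the padded row and column would also be sampled, so a real implementation would need to choose the padding to match the intended output shape.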

2. The artificial intelligence execution acceleration system of claim 1, wherein the execution weight extraction module extracts a weight file format previously stored in the artificial intelligence learning server.

3. A method of setting an execution model of artificial intelligence for artificial intelligence execution acceleration, the method comprising:

(A) loading, from an artificial intelligence learning server, a weight file generated as a result of learning in the artificial intelligence learning server and a learning model including an artificial intelligence model;

(B) visualizing metadata of the learning model in the learning model by using operation functions including convolution and ReLU, by an artificial intelligence accelerated execution file setting module;

(C) performing an optimization process of a custom layer with an optimization operation function configured to perform a ReLU operation after a Concat operation so as to perform ReLU operations as a single operation in an optimization module, generating the execution model when optimization of the custom layer is completed, and applying a weight value of the execution model received from an execution weight extraction module to the execution model;

(D) setting a visualized learning model file as the custom layer configured for use in an artificial intelligence execution accelerator by using custom layer setting functions including RPN (region proposal network), NMS (non-maximum suppression), and pooling, by the artificial intelligence accelerated execution file setting module;

(E) converting the custom layer into the execution model configured for use in the artificial intelligence execution accelerator by adding the loaded weight file to the custom layer by the artificial intelligence accelerated execution file setting module; and

(F) accelerating execution speed of artificial intelligence by receiving the execution model from an artificial intelligence execution acceleration module and configuring an execution environment corresponding to the execution model,

wherein step (D) adjusts a module position of the custom layer by combining the ReLU operations into the single operation and notifying that parallelism is present, calculates the execution model by modifying an operation method of the custom layer by an Even selection operation for selecting only even terms from a matrix and an Odd selection operation for extracting odd terms from the matrix,

the Even selection operation is performed by calculating AvgPool 1×1/2 (where, kernel size=1×1, stride size=2×2) according to an average pooling method,

the Odd selection operation is performed by calculating AvgPool 1×1/2 after cropping and padding according to a crop method, a padding method, and the average pooling method,

step (F) comprises:

calculating an allocated amount of a memory required for each artificial intelligence execution step of the execution model; and

optimizing the memory of the artificial intelligence execution acceleration module by reusing a memory area required for each artificial intelligence execution step, and

step (F) transforms data processing between a CPU and a GPU, and processes the execution model of artificial intelligence inside the GPU in an async mode so as to minimize an occurrence of overhead.
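The memory handling in step (F), computing the memory needed per execution step and reusing areas whose data is no longer needed, can be sketched as a liveness-based buffer plan. This is a simplified greedy illustration, not the patented procedure and not guaranteed to be optimal: the step names, the size units, and the `plan_buffers` function are hypothetical, and steps are assumed to run in the listed order.

```python
def plan_buffers(steps):
    """Assign each step's output to a physical buffer, reusing freed ones.

    steps: list of (name, size, inputs) tuples in execution order, where
    inputs names the earlier steps whose outputs this step reads.
    Returns (assignment, total_size): buffer index per step and the sum
    of all physical buffer sizes.
    """
    # Record the last step index at which each output is still read.
    last_use = {}
    for index, (name, _size, inputs) in enumerate(steps):
        last_use[name] = index
        for source in inputs:
            last_use[source] = index

    buffers = []      # size of each physical buffer
    free = []         # indices of buffers whose data is no longer needed
    assignment = {}
    for index, (name, size, inputs) in enumerate(steps):
        # Prefer the smallest free buffer that is large enough.
        candidates = [b for b in free if buffers[b] >= size]
        if candidates:
            chosen = min(candidates, key=lambda b: buffers[b])
            free.remove(chosen)
        else:
            chosen = len(buffers)
            buffers.append(size)
        assignment[name] = chosen
        # Inputs are released only after the output has its own buffer,
        # so a step never overwrites data it is still reading.
        for source in set(inputs):
            if last_use[source] == index:
                free.append(assignment[source])
    return assignment, sum(buffers)


if __name__ == "__main__":
    example = [
        ("conv1", 100, []),
        ("relu1", 100, ["conv1"]),
        ("conv2", 50, ["relu1"]),
        ("pool", 50, ["conv2"]),
    ]
    print(plan_buffers(example))
```

In this example the four outputs would need 300 units if each had its own buffer, while the plan above uses two buffers totalling 200 units, because `conv2` reuses the area freed once `conv1` has been consumed.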

4. The method of claim 3, wherein step (E) comprises:

extracting a weight file format previously stored in the artificial intelligence learning server by the execution weight extraction module;

converting the extracted weight file format into the learning model; and

applying the converted weight file format to the execution model.

Description

CROSS-REFERENCE TO RELATED APPLICATION
[0001]
This application is a Section 371 National Stage Application of International Application No. PCT/KR2018/013795, filed Nov. 13, 2018, the contents of which are hereby incorporated by reference in their entirety.

TECHNICAL FIELD
[0002]
The present disclosure relates to an artificial intelligence execution acceleration system and a method of setting an artificial intelligence execution model and, more particularly, to an artificial intelligence execution acceleration system, an artificial intelligence execution acceleration server, and a method of setting an artificial intelligence execution model thereof, wherein the server is configured to remove a learning function and perform only an execution function of artificial intelligence, so as to increase execution speed of artificial intelligence and reduce memory usage.

BACKGROUND ART
[0003]
Unless otherwise indicated in the present disclosure, the content described in this section is not related art to the claims of this application and is not admitted to be the related art by inclusion in this section.

[0004]
Artificial intelligence is a field of computer engineering and information technology that studies how to enable computers to perform tasks such as thinking, learning, and self-development that humans can do, and is a technology that allows the computers to imitate the intelligent behavior of humans. In addition, artificial intelligence does not exist by itself, but is directly or indirectly related to other fields of computer science. In particular, in modern times, attempts to introduce artificial intelligence elements in various fields of information technology to utilize the elements for problem solving in those fields have been made very actively, and there is a trend of expanding functions of digital devices by combining artificial intelligence with various aspects of real life.

[0005]
Artificial intelligence may be broadly divided into a server that performs a learning function and a module that performs an execution function. The server performing the learning function collects vast amounts of data, finds features in the data, and performs data processing, such as data patterning, to train electronic devices, whereas the module performing the execution function processes input data by using the values optimized through learning and provides an inference function based thereon.

[0006]
Since a learning process of artificial intelligence requires a vast amount of data throughput, artificial intelligence needs a high-performance server and tens of gigabytes of memory. When an execution process of artificial intelligence is performed by using the result of the learning process, the data processing speed is inevitably slow because high-level computational processing such as data recognition, interpretation, and patterning is continuously performed in the learning process.

[0007]
Even when the Internet is disconnected, the trained artificial intelligence should serve a role, but the conventional cloud-based artificial intelligence service is unable to perform artificial intelligence functions in an environment where connection to the Internet is not possible. In a case where a learning function of artificial intelligence is further installed in an edge terminal, on which artificial intelligence functions are performed, in order to solve this problem, the data processing speed of an artificial intelligence model becomes too slow, causing great inconvenience for practical terminal use for a user.

[0008]
In addition, in order to develop a program that uses artificial intelligence, program developers should be familiar with difficult artificial intelligence APIs, whereby there exists a limitation in developing the artificial intelligence related programs.

DISCLOSURE

Technical Problem
[0009]
An objective of the present invention is to provide an artificial intelligence execution acceleration system and a method of setting an execution model of artificial intelligence, wherein a learning function and an execution function of artificial intelligence are separated in order to accelerate execution speed of artificial intelligence, so that only the execution function of artificial intelligence, excluding the learning function thereof, is performed in a smart device at an edge of an artificial intelligence system.

[0010]
In particular, as for a weight value used by an artificial intelligence execution acceleration server according to an exemplary embodiment, the weight value optimized for the execution model is generated by extracting a weight file from an external learning server, and is applied to the execution model, so that data processing speed of an artificial intelligence execution module is accelerated.

Technical Solution
[0011]
An artificial intelligence execution acceleration system according to an exemplary embodiment includes: an execution weight extraction module for calculating a data weight of a learning model of artificial intelligence for artificial intelligence execution acceleration in the learning model that includes an artificial intelligence model and a weight file generated as a result of artificial intelligence learning; an artificial intelligence accelerated execution file setting module for loading the learning model from an artificial intelligence learning server that calculates the learning model, converting the loaded learning model into a custom layer usable in the artificial intelligence execution acceleration system, and then optimizing the custom layer through the process of adjusting an operation function and a module position and modifying an operation method to calculate an execution model; and an execution acceleration module for receiving the execution model, configuring an execution environment corresponding to the execution model, and accelerating execution speed of artificial intelligence.
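One adjustment of operation function and module position named in the claims is performing a single ReLU after a Concat operation instead of one ReLU per branch. The sketch below only checks why that rewrite preserves results: ReLU acts element by element, so applying it before or after concatenation gives the same values. The function names are hypothetical and flat Python lists stand in for feature maps.

```python
def relu(values):
    """Element-wise ReLU: negative entries become 0.0."""
    return [max(0.0, v) for v in values]


def concat(*branches):
    """Concatenate branch outputs in order."""
    return [v for branch in branches for v in branch]


def relu_per_branch(branches):
    """Original graph: one ReLU per branch, then Concat."""
    return concat(*(relu(branch) for branch in branches))


def relu_after_concat(branches):
    """Rewritten graph: Concat first, then a single ReLU."""
    return relu(concat(*branches))


if __name__ == "__main__":
    branches = [[-1.0, 2.0], [3.0, -4.0]]
    print(relu_per_branch(branches))
    print(relu_after_concat(branches))
```

Both functions return the same list for any input, so an optimizer can replace several ReLU launches with one without changing the model's output.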

Advantageous Effects
[0012]
The artificial intelligence execution acceleration server as described above reduces a required amount of memory and hardware resources (i.e., CPU and GPU) by separating the learning function and execution function of artificial intelligence from each other, so that the server costs for executing an artificial intelligence model may be reduced, the processing performance may be improved, and the artificial intelligence model may be executed even on an edge device having a lower specification, thereby enabling a service using artificial intelligence to be provided even in a situation where connection to the Internet is not possible.

[0013]
Through the exemplary embodiment, the artificial intelligence model that must be executed on an expensive server may be executed on a PC-class device, and the artificial intelligence model may be quickly executed with a small memory even in a small IoT device.

[0014]
In addition, since the exemplary embodiment uses a method in which the artificial intelligence model is loaded in a s…

Cited documents

  • Gray et al. “Developing Deep Neural Networks with NVIDIA TensorRT”, 2017, pp. 7, https://developer.nvidia.com/blog/deploying-deep-learning-nvidia-tensorrt/.
  • Sheng et al. “EasyConvPooling: Random Pooling with Easy Convolution for Accelerating Training and Testing”, Jun. 2018, pp. 9, https://arxiv.org/pdf/1806.01729.pdf.
  • Kollara et al., Deep Learning Diaries: Building Custom Layers in Keras, Internet Post, Nov. 1, 2017, [Retrieved on Jul. 11, 2019], Retrieved from <URL: https://www.sama.com/blog/deep-leanring-diaries-building-custom-layers-in-keras>.
  • First Office Action for Korean Application No. 10-2018-0136437 dated Apr. 22, 2020, with its English translation, 19 pages.
  • Second Office Action for Korean Application No. 10-2018-0136437 dated Jul. 1, 2020, with its English translation, 8 pages.
  • Decision to Grant a Patent for Korean Application No. 10-2018-0136437 dated Sep. 2, 2020, with its English translation, 2 pages.
  • Gray et al., Deploying Deep Neural Networks with NVIDIA TensorRT, nVIDIA, Apr. 2, 2017, 12 pages.
  • Prasana et al., TensorRT 3: Faster TensorFlow Inference and Volta Support., nVIDIA, Dec. 4, 2017, 16 pages.
  • TensorRT User Guide—nVIDIA. DU-08540-021_v01. Jul. 2017, 31 pages.

Drawings

Gallery of graphic materials obtained from the document.

Drawing 1
File: /media/National/US/B2/2022/02/22/0011257008/00000002.tif/png
Size: 118x185
Drawing 2
File: /media/National/US/B2/2022/02/22/0011257008/00000003.tif/png
Size: 115x191
Drawing 3
File: /media/National/US/B2/2022/02/22/0011257008/00000004.tif/png
Size: 143x108
Drawing 4
File: /media/National/US/B2/2022/02/22/0011257008/00000005.tif/png
Size: 148x178
Drawing 5
File: /media/National/US/B2/2022/02/22/0011257008/00000006.tif/png
Size: 143x186
Drawing 6
File: /media/National/US/B2/2022/02/22/0011257008/00000007.tif/png
Size: 136x190
Drawing 7
File: /media/National/US/B2/2022/02/22/0011257008/00000008.tif/png
Size: 136x186
Drawing 8
File: /media/National/US/B2/2022/02/22/0011257008/00000009.tif/png
Size: 107x184
Drawing 9
File: /media/National/US/B2/2022/02/22/0011257008/00000010.tif/png
Size: 96x113
Drawing 10
File: /media/National/US/B2/2022/02/22/0011257008/00000011.tif/png
Size: 100x96
Drawing 11
File: /media/National/US/B2/2022/02/22/0011257008/00000012.tif/png
Size: 142x161
Drawing 12
File: /media/National/US/B2/2022/02/22/0011257008/00000013.tif/png
Size: 70x177
Drawing 13
File: /media/National/US/B2/2022/02/22/0011257008/00000014.tif/png
Size: 113x192