-
Publication No.: GB2575628A
Publication date: 2020-01-22
Application No.: GB201811197
Filing date: 2018-07-09
Applicant: NOKIA TECHNOLOGIES OY
Inventor: FRANCESCO CRICRI , ANTTI HALLAPURO , MISKA HANNUKSELA , JANI LAINEMA , EMRE BARIS AKSU , CAGLAR AYTEKIN , RAMIN GHAZNAVI YOUVALARI
IPC: H04N19/50 , G06N3/02 , G06N3/04 , G06N3/08 , H04N19/103 , H04N19/136 , H04N19/14 , H04N19/172 , H04N19/196 , H04N19/85
Abstract: A method comprises: obtaining or receiving video data; providing a current frame and/or one or more previous frames of the obtained or received video data to an input of a neural network (NN); generating a predicted output at an output of the neural network, comprising at least one of one or more predicted future frames of the video data and predicted properties of one or more future frames of the video data; determining one or more processing decisions based, at least in part, on the predicted output; and processing the current frame of the video data at least partially according to the one or more processing decisions. Predicted future frames may be encoded for transmission. Processing the current frame may comprise generating residual information based on a difference between a current frame and an earlier prediction of the current frame. The processing decisions may include determining whether to store the current video frame as a reference frame or determining an encoding method for at least the current video frame. Predicted future frames may comprise plural sets of possible future frames.
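The residual step and the reference-frame decision in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the claimed method: frames are flat lists of pixel intensities, the neural network is replaced by a hypothetical stand-in (`predict_next_frame`, plain linear extrapolation), and both `REFERENCE_THRESHOLD` and the decision rule in `should_store_as_reference` are invented for the example.

```python
from typing import List

Frame = List[float]

# Assumed value: maximum mean change for which the current frame
# is still considered a useful reference for future frames.
REFERENCE_THRESHOLD = 10.0


def predict_next_frame(previous: Frame, current: Frame) -> Frame:
    """Hypothetical stand-in for the neural network: linear extrapolation."""
    return [2 * c - p for p, c in zip(previous, current)]


def residual(current: Frame, earlier_prediction: Frame) -> Frame:
    """Difference between the current frame and an earlier prediction of it."""
    return [c - e for c, e in zip(current, earlier_prediction)]


def mean_abs(values: Frame) -> float:
    return sum(abs(v) for v in values) / len(values)


def should_store_as_reference(current: Frame, predicted_future: Frame) -> bool:
    """Assumed heuristic: keep the current frame as a reference when the
    predicted future frame stays close to it."""
    return mean_abs(residual(predicted_future, current)) <= REFERENCE_THRESHOLD


previous_frame = [10.0, 10.0, 10.0]
current_frame = [12.0, 12.0, 12.0]
predicted = predict_next_frame(previous_frame, current_frame)

# When the next frame actually arrives, only the residual needs encoding.
actual_next = [15.0, 14.0, 13.0]
next_residual = residual(actual_next, predicted)
store = should_store_as_reference(current_frame, predicted)
```

A real encoder would replace the extrapolation with the trained network and would feed the decision into its reference-picture management; the sketch only shows where the predicted output enters the decision.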
-
Publication No.: GB2572949A
Publication date: 2019-10-23
Application No.: GB201805973
Filing date: 2018-04-11
Applicant: NOKIA TECHNOLOGIES OY
Inventor: CAGLAR AYTEKIN , FRANCESCO CRICRI , FAN LIXIN
IPC: G06N3/02
Abstract: An apparatus (e.g. a mobile phone or an Internet of Things (IoT) enabled device) receives layer weight parameters of a trained neural network (NN) and uses a subnetwork part, e.g. 32, of the NN. The subnetwork has intermediate hidden layers 24, 25 which correspond to the weights of layers of the trained NN. The subnetwork further comprises an output layer (intermediate output 36). The intermediate output layer may be used as the output of the whole system in the event that the subsequent layers of the NN are missing. In this way a scalable neural network may be maintained even though some neural network data may be missing. The NN may be pre-trained on a server and transmitted to the apparatus in a message sequence. Alternatively, the output layer may be trained on the apparatus from scratch, or it may be fine-tuned from a base layer on the server.
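The idea of falling back to an intermediate output layer when later layers are missing can be sketched as below. This is a minimal illustration, assuming fully connected layers with ReLU activations and one output head per depth; the names `dense`, `forward` and `heads` are hypothetical and do not come from the patent.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def dense(x: Vector, weights: Matrix, relu: bool = True) -> Vector:
    """Fully connected layer without bias; each row of weights is one unit."""
    out = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return [max(0.0, o) for o in out] if relu else out


def forward(x: Vector, received_hidden: List[Matrix],
            heads: Dict[int, Matrix]) -> Vector:
    """Run only the hidden layers that were received, then apply the
    output head attached at that depth (assumed: one head per depth)."""
    h = x
    for weights in received_hidden:
        h = dense(h, weights)
    return dense(h, heads[len(received_hidden)], relu=False)


hidden_1 = [[1.0, 0.0], [0.0, 1.0]]
hidden_2 = [[1.0, 1.0], [1.0, -1.0]]
heads = {1: [[1.0, 1.0]],   # intermediate output after the first layer
         2: [[2.0, 0.0]]}   # final output after the second layer

x = [1.0, 2.0]
partial = forward(x, [hidden_1], heads)           # second layer missing
full = forward(x, [hidden_1, hidden_2], heads)    # all layers received
```

The device produces a usable output from whatever prefix of the network it holds; how the intermediate head is trained (on the device or fine-tuned from a server-side base layer) is outside this sketch.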
-
Publication No.: GB2571342A
Publication date: 2019-08-28
Application No.: GB201803083
Filing date: 2018-02-26
Applicant: NOKIA TECHNOLOGIES OY
Inventor: CAGLAR AYTEKIN , FRANCESCO CRICRI , EMRE BARIS AKSU
Abstract: A system comprises a controller and a plurality of communication devices 23-27, wherein the controller transmits signalling data to each device over a communications network. The signalling data causes each device to initialise a subset of local layers of a multi-layer artificial neural network architecture, with each device providing a different combination of layers. Assigning a combination of layers to each device forms a distributed neural network. The signalling data may indicate the topology of the layers, and this may define the number of nodes in each layer, the type of each layer, and the weight associated with each layer. The signalling data may also define the processing operations to be performed by the nodes in each layer. The controller may receive the computing resources and capability of each device, and may re-assign layers if an assigned device is incapable of providing a layer. The neural network architecture may be implemented in Internet of Things (IoT) devices. The method, apparatus and computer program of both the controller and one of the devices are claimed.
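One way a controller might assign layers to devices according to reported capability is sketched below. The greedy first-fit rule, the integer "cost" and "capacity" units, and the device labels are all assumptions made for illustration; the abstract does not specify an assignment algorithm.

```python
from typing import Dict, List


def assign_layers(layer_costs: Dict[str, int],
                  capacities: Dict[str, int]) -> Dict[str, List[str]]:
    """Greedy first-fit: give each layer to the first device that still
    has enough reported capacity (assumed units, e.g. memory blocks)."""
    remaining = dict(capacities)
    assignment: Dict[str, List[str]] = {device: [] for device in capacities}
    for layer, cost in layer_costs.items():
        device = next((d for d, cap in remaining.items() if cap >= cost), None)
        if device is None:
            raise ValueError(f"no device can provide layer {layer!r}")
        assignment[device].append(layer)
        remaining[device] -= cost
    return assignment


# Hypothetical layers and device labels.
layers = {"conv1": 2, "conv2": 2, "fc": 1}
devices = {"device_a": 2, "device_b": 2, "device_c": 1}
plan = assign_layers(layers, devices)

# Re-assignment when a device reports it cannot provide its layer:
# drop it from the capacity table and plan again.
fallback = assign_layers(layers, {"device_a": 4, "device_c": 1})
```

In a deployed system the resulting plan would be turned into the signalling data sent to each device (layer type, node count, weights); that message format is not modelled here.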
-
Publication No.: GB2557241A
Publication date: 2018-06-20
Application No.: GB201620422
Filing date: 2016-12-01
Applicant: NOKIA TECHNOLOGIES OY
Inventor: ANTTI JOHANNES ERONEN , JUSSI ARTTURI LEPPANEN , FRANCESCO CRICRI , ARTO JUHANI LEHTINIEMI
IPC: G10L21/028 , G10L25/57 , H04S7/00
Abstract: In a 3D spatial audio rendering system, a visual image (fig 1A) is analysed to detect corresponding sound objects (fig 1B) whose spatial extent (304) is then modified (e.g. increased as in fig 2B) and rendered (404 fig 3) based on the visual analysis (e.g. based on the size of the visual object 208, fig 2A). The sound object may be separated into sub-objects to which rules are applied regarding, e.g., spatial separation of similar frequency bins (as in a set of drums, fig 5).
-
Publication No.: GB2557218A
Publication date: 2018-06-20
Application No.: GB201620325
Filing date: 2016-11-30
Applicant: NOKIA TECHNOLOGIES OY
Inventor: ANTTI JOHANNES ERONEN , JUSSI ARTTURI LEPPANEN , FRANCESCO CRICRI , ARTO JUHANI LEHTINIEMI
Abstract: A position/orientation of an audio source 101, 103, 105 within an audio scene is controlled based on a received current physical position/orientation of the audio source relative to a capture device 207 (said capture device comprising a microphone array), a received earlier physical position/orientation of the audio source 101, 103, 105 relative to the capture device, and a received control parameter. The controllable position 501, 503, 505 of the audio source is between the current and earlier positions/orientations. Preferably the capture device comprises a camera. The main embodiment of the invention involves panning individual audio tracks so as to better match a viewed audio scene, e.g. if a guitar is on the left of a stage and a piano is on the right, the respective audio tracks of each instrument would be panned to match their positions. However, in some situations panning the audio tracks to match the perceived positions of the audio sources would result in a sub-optimal mix, while an optimal mix would confuse a listener because the audio sources would not match the viewed image, i.e. would lack spatial congruence.
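A position that lies between the earlier and current positions under a single control parameter can be illustrated with linear interpolation. The abstract does not say how the intermediate position is computed, so the interpolation rule, the 2D coordinates and the name `controlled_position` are assumptions.

```python
from typing import Tuple

Position = Tuple[float, ...]


def controlled_position(earlier: Position, current: Position,
                        control: float) -> Position:
    """Return a point between the earlier and current positions.
    control = 0.0 keeps the earlier position, 1.0 follows the source fully
    (assumed convention)."""
    if not 0.0 <= control <= 1.0:
        raise ValueError("control parameter must be in [0, 1]")
    return tuple(e + control * (c - e) for e, c in zip(earlier, current))


earlier = (0.0, 0.0)   # where the source was, relative to the capture device
current = (4.0, 2.0)   # where the source is now

halfway = controlled_position(earlier, current, 0.5)
```

A mixing engineer could then lower the control parameter to keep a preferred mix, or raise it to restore agreement between what is heard and what is seen.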
-
Publication No.: GB2545275A
Publication date: 2017-06-14
Application No.: GB201521917
Filing date: 2015-12-11
Applicant: NOKIA TECHNOLOGIES OY
Inventor: JUSSI ARTTURI LEPPANEN , ANTTI JOHANNES ERONEN , ARTO JUHANI LEHTINIEMI , FRANCESCO CRICRI , MIIKKA TAPANI VILERMO
IPC: G06F3/01 , G06F3/0346 , G06F3/14 , G06F3/16
Abstract: Virtual or augmented reality (VR) content is provided to a user via portable equipment located at a first location L1-1 and having a first orientation O1-1, the VR content being associated with a second location L2 and a second orientation O2. The VR content is rendered for provision in dependence on the first location relative to the second location (X1-1) and the first orientation relative to the second orientation (θ1-1). The second location and orientation can be a fixed geographic point or the position of a second portable user equipment for providing a second version of the VR content. In the latter case, if the second user equipment is within the virtual field of view of the first user, content representing the second user is provided to the first user. The VR content may be derived from plural items captured by dedicated devices arranged in a two or three-dimensional array, and may comprise a portion of a cylindrical panorama. The virtual content may comprise audio content with plural sub-components which may appear to come from a single point source if the virtual distance from the user is above a threshold.
-
Publication No.: GB2562037A
Publication date: 2018-11-07
Application No.: GB201706499
Filing date: 2017-04-25
Applicant: NOKIA TECHNOLOGIES OY
Inventor: KIMMO TAPIO ROIMELA , FRANCESCO CRICRI
IPC: G06T7/11 , G01S13/86 , G06T7/50 , G06T7/521 , H04N13/133
Abstract: A method for three-dimensional scene reconstruction is disclosed, comprising receiving data from a single or multi-camera device representing a visual image of a space comprising an object, the visual image being captured from a first viewpoint. A 3D region of interest associated with the object is determined using the visual image, possibly by means of a convolutional neural network. A depth map is then generated from a second viewpoint, different from the first viewpoint, by scanning a limited portion of the space, which includes the region of interest, at a predetermined scanning resolution or density, possibly using LIDAR. The region of interest may be updated using the depth map or information derived therefrom, which may involve reducing its volume.
-
Publication No.: GB2556922A
Publication date: 2018-06-13
Application No.: GB201620008
Filing date: 2016-11-25
Applicant: NOKIA TECHNOLOGIES OY
Inventor: FRANCESCO CRICRI
Abstract: In response to determining that a portion of location data indicative of a location within a scene of a source (26, figure 1) of a captured audio component is unreliable, determining a region of a captured video component into which to zoom such that the source of the captured audio component is within the determined region. The method may be used when rendering in a virtual reality system, where spatial audio mixing (SAM) is used with positioning detection technology such as High Accuracy Indoor Positioning (HAIP) being used for determining the position of actors or other audio sources in the captured scene. The actors may wear a radio tag which is continuously tracked by an antenna generally co-located with a virtual reality camera (22, figure 1).
-
Publication No.: GB2553351A
Publication date: 2018-03-07
Application No.: GB201615006
Filing date: 2016-09-05
Applicant: NOKIA TECHNOLOGIES OY
Inventor: JUKKA SAARINEN , FRANCESCO CRICRI
Abstract: A method and apparatus configured to: identify a plurality of objects 2.1 in a captured video and/or audio scene; process the scene by removing an object 2.2 from the scene; measure the effect of removing said object using received data 2.3; reintroduce said object into the scene 2.5; and repeat said steps for one or more other objects in the scene in turn 2.6. A saliency level is then determined for each object based on the measured effect 2.8. The effect may be measured by the interactions of the remaining objects, and the received data may be from sensors of an external observer, possibly using a VR headset.
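The remove, measure, reintroduce loop described here can be sketched as follows. The sketch assumes the "effect" is a single number returned by a caller-supplied function and that saliency is the absolute change from a baseline measurement; both choices, and the example attention values, are hypothetical.

```python
from typing import Callable, Dict, List


def saliency_levels(objects: List[str],
                    measure_effect: Callable[[List[str]], float]) -> Dict[str, float]:
    """Remove each object in turn, measure the scene, put the object back.
    Saliency is taken as the absolute change from the full-scene baseline
    (an assumed definition)."""
    scene = list(objects)
    baseline = measure_effect(scene)
    levels: Dict[str, float] = {}
    for obj in objects:
        scene.remove(obj)                             # remove the object
        levels[obj] = abs(baseline - measure_effect(scene))
        scene.append(obj)                             # reintroduce it
    return levels


# Hypothetical observer data: attention each object draws when present.
attention = {"singer": 5.0, "drummer": 2.0, "plant": 0.0}


def total_attention(scene: List[str]) -> float:
    return sum(attention[name] for name in scene)


levels = saliency_levels(list(attention), total_attention)
```

In practice the measurement would come from sensor data, for example the gaze or head movement of an observer wearing a VR headset, rather than from a lookup table.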
-
Publication No.: GB2543275A
Publication date: 2017-04-19
Application No.: GB201518023
Filing date: 2015-10-12
Applicant: NOKIA TECHNOLOGIES OY
Inventor: ANTTI JOHANNES ERONEN , JUSSI ARTTURI LEPPANEN , ARTO JUHANI LEHTINIEMI , SUJEET SHYAMSUNDAR MATE , FRANCESCO CRICRI
Abstract: Apparatus comprising a processor configured to: receive a spatial audio signal associated with a microphone array 113 configured to provide spatial audio capture and at least one additional audio signal associated with an additional microphone 111 such as a Lavalier microphone, the additional audio signal having been delayed by a variable delay determined such that common components of the audio signals are time aligned. A relative position between the microphone array and the additional microphone is determined. The apparatus receives at least one source parameter classifying an audio source associated with the common components and/or at least one space parameter identifying the environment within which the audio source is located. The space and/or source parameters are used to determine at least one processing effect ruleset. At least two output audio channel signals are generated by mixing and applying the at least one processing effect to the spatial audio signal and the at least one additional audio signal.
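The variable delay that time-aligns the additional microphone signal with the array signal could, for example, be estimated by cross-correlation. The abstract does not name an estimation method, so the brute-force search below, the function names and the toy signals are assumptions.

```python
from typing import List

Signal = List[float]


def best_delay(reference: Signal, additional: Signal, max_delay: int) -> int:
    """Return the delay (in samples) of the additional signal that
    maximises its correlation with the reference signal."""
    def score(delay: int) -> float:
        return sum(reference[i + delay] * additional[i]
                   for i in range(len(additional))
                   if i + delay < len(reference))
    return max(range(max_delay + 1), key=score)


def delay_signal(signal: Signal, delay: int) -> Signal:
    """Delay a signal by padding zeros at the start, keeping its length."""
    return [0.0] * delay + signal[:len(signal) - delay]


# Toy signals: the close-up microphone hears the impulse two samples
# before the microphone array does.
array_signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
close_mic = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]

delay = best_delay(array_signal, close_mic, max_delay=4)
aligned = delay_signal(close_mic, delay)
```

Once the two signals are aligned, the source and space parameters can select a processing-effect ruleset before the channels are mixed; that later stage is not shown.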