Welcome to easycore’s documentation!¶
Tutorials¶
Light weight config tools¶
easycore makes it easy to load a config from a local YAML file, save it, and modify it at runtime.
Load config from local yaml file¶
An example YAML file is shown below:
MODEL:
  IN_FEATURES: ["res3", "res4", "res5"]
  INPUT_SIZE: (224, 224)
  NUM_CLASSES: 100
NAME: YuxinZhaozyx
You can load the YAML file in the following way:
from easycore.common.config import CfgNode as CN
cfg = CN.open('example.yaml')
# or
with open('example.yaml', 'r', encoding='utf-8') as f:
    cfg = CN.open(f)
Get an empty config¶
cfg = CN()
Get a config from a Python dict¶
init_dict = {
    "MODEL": {
        "IN_FEATURES": ["res3", "res4", "res5"],
        "INPUT_SIZE": (224, 224),
        "NUM_CLASSES": 100,
    },
    "NAME": "YuxinZhaozyx",
}
cfg = CN(init_dict)
Use config¶
# get value from config
# values are automatically converted to Python data types.
in_features = cfg.MODEL.IN_FEATURES # list
input_size = cfg.MODEL.INPUT_SIZE # tuple
num_classes = cfg.MODEL.NUM_CLASSES # int
name = cfg.NAME # str
# add new value to config
cfg.LICENSE = 'MIT'
# add a new CfgNode to config
cfg.SOLVER = CN()
cfg.SOLVER.LEARNING_RATE = 0.001
cfg.SOLVER.BATCH_SIZE = 128
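The attribute-style access used above can be illustrated with a small sketch. `AttrDict` is a hypothetical stand-in written for this example, not easycore's `CfgNode`; it only assumes that a config node is a dict whose keys can also be read and written as attributes.

```python
class AttrDict(dict):
    """Hypothetical stand-in for a dict-based config node (not easycore's CfgNode)."""

    def __init__(self, init_dict=None):
        super().__init__()
        for key, value in (init_dict or {}).items():
            # Wrap nested dicts so that chained attribute access works.
            self[key] = AttrDict(value) if isinstance(value, dict) else value

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


cfg = AttrDict({"MODEL": {"NUM_CLASSES": 100}, "NAME": "YuxinZhaozyx"})
cfg.LICENSE = "MIT"
cfg.SOLVER = AttrDict()
cfg.SOLVER.BATCH_SIZE = 128
print(cfg.MODEL.NUM_CLASSES, cfg.LICENSE, cfg.SOLVER.BATCH_SIZE)
```

Because the sketch subclasses `dict`, the usual dictionary operations such as `cfg["NAME"]` and `cfg.items()` keep working alongside attribute access.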
Merge two configs¶
cfg_a = CN()
cfg_a.key1 = 1
cfg_a.key2 = 2
cfg_b = CN()
cfg_b.key2 = 3
cfg_b.key3 = 4
# merge two configs
cfg_a.merge(cfg_b) # now cfg_a.key2 is 3
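The override behaviour of a merge can be sketched with plain dicts. `merge_dicts` is a hypothetical helper for illustration only; in particular, descending into nested sections is an assumption here and is not confirmed for `CfgNode.merge`.

```python
def merge_dicts(base, other):
    """Merge `other` into `base` in place; values from `other` win on conflicts."""
    for key, value in other.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge_dicts(base[key], value)  # assumption: nested sections merge recursively
        else:
            base[key] = value
    return base


a = {"key1": 1, "key2": 2, "nested": {"x": 1, "y": 2}}
b = {"key2": 3, "key3": 4, "nested": {"y": 20}}
merge_dicts(a, b)
print(a)
```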
Copy a config¶
cfg_copy = cfg.copy() # get a deepcopy of cfg
Save config to yaml file¶
cfg.save("example-save.yaml")
# or
with open("example-save.yaml", 'w', encoding='utf-8') as f:
    cfg.save(f)
API Documentation¶
Multiprocessing parallel acceleration tools¶
easycore makes it easy to parallelize your tasks across CPUs and GPUs.
API¶
You can write a parallel runner by inheriting from the class UnorderedRunner or OrderedRunner and overriding the following 6 static methods.
@staticmethod
def producer_init(device, cfg):
    """
    Function for producer initialization.
    Args:
        device (str): device for this process.
        cfg (easycore.common.config.CfgNode): config of this process, you can use it to transfer data
            to the `producer_work` and `producer_end` functions.
    """
    pass

@staticmethod
def producer_work(device, cfg, data):
    """
    Function specifying how the producer processes the data.
    Args:
        device (str): device for this process.
        cfg (easycore.common.config.CfgNode): config of this process, you can use it to get data from
            the `producer_init` function and transfer data to the next `producer_work` and `producer_end`
            functions.
        data (Any): data taken from the input of the `__call__` method.
    Returns:
        Any: processed data
    """
    return data

@staticmethod
def producer_end(device, cfg):
    """
    Function called after the producer has finished all of its tasks and before the process is closed.
    Args:
        device (str): device for this process.
        cfg (easycore.common.config.CfgNode): config of this process, you can use it to get data
            from the `producer_init` and `producer_work` functions.
    """
    pass

@staticmethod
def consumer_init(cfg):
    """
    Function for consumer initialization.
    Args:
        cfg (easycore.common.config.CfgNode): config of this process, you can use it to transfer data
            to the `consumer_work` and `consumer_end` functions.
    """
    pass

@staticmethod
def consumer_work(cfg, data):
    """
    Function specifying how the consumer processes the data from producers.
    Args:
        cfg (easycore.common.config.CfgNode): config of this process, you can use it to get data from
            the `consumer_init` function and transfer data to the next `consumer_work` and `consumer_end`
            functions.
    """
    pass

@staticmethod
def consumer_end(cfg):
    """
    Function called after all data from producers has been received.
    Args:
        cfg (easycore.common.config.CfgNode): config of this process, you can use it to get data from
            the `consumer_work` function.
    Returns:
        Any: processed data
    """
    return None
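To make the role of the six hooks concrete, the following sketch walks through them in a single process. `run_sequentially` and `SumOfSquares` are hypothetical names invented for this illustration, and `SimpleNamespace` stands in for `CfgNode`. The real runners execute producers in separate processes, so the exact interleaving of calls shown here is an assumption.

```python
from types import SimpleNamespace


def run_sequentially(runner_cls, data_iter, device="cpu"):
    """Hypothetical single-process walk through the six hooks (no parallelism)."""
    producer_cfg = SimpleNamespace()  # stands in for the producer's CfgNode
    consumer_cfg = SimpleNamespace()  # stands in for the consumer's CfgNode
    runner_cls.producer_init(device, producer_cfg)
    runner_cls.consumer_init(consumer_cfg)
    for data in data_iter:
        processed = runner_cls.producer_work(device, producer_cfg, data)
        runner_cls.consumer_work(consumer_cfg, processed)
    runner_cls.producer_end(device, producer_cfg)
    return runner_cls.consumer_end(consumer_cfg)


class SumOfSquares:
    @staticmethod
    def producer_init(device, cfg):
        pass

    @staticmethod
    def producer_work(device, cfg, data):
        return data * data

    @staticmethod
    def producer_end(device, cfg):
        pass

    @staticmethod
    def consumer_init(cfg):
        cfg.sum = 0

    @staticmethod
    def consumer_work(cfg, data):
        cfg.sum += data

    @staticmethod
    def consumer_end(cfg):
        return cfg.sum


result = run_sequentially(SumOfSquares, range(100))
print(result)  # 328350
```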
Example 1: Sum of squares¶
It can be implemented in a simple way:
data_list = list(range(100))
result = sum([data * data for data in data_list])
# or, written as an explicit loop
result = 0
for data in data_list:
    square = data * data
    result += square
We calculate the square of each element of the list and then sum them together. This work can be divided into two tasks, squaring and summing, which we assign to the producer and the consumer respectively.
from easycore.common.config import CfgNode
from easycore.common.parallel import UnorderedRunner
class Runner(UnorderedRunner):
    @staticmethod
    def producer_work(device, cfg, data):
        return data * data  # calculate square of data

    @staticmethod
    def consumer_init(cfg):
        cfg.sum = 0  # init a sum variable with 0, you can use cfg to transfer data

    @staticmethod
    def consumer_work(cfg, data):
        cfg.sum += data  # add the square to the sum variable

    @staticmethod
    def consumer_end(cfg):
        return cfg.sum  # return the result you need

if __name__ == '__main__':
    runner = Runner(devices=3)  # if `devices` is an integer, that many CPUs are used.
    # You can specify a list of str instead, such as:
    # runner = Runner(devices=["cpu", "cpu", "cpu"])
    data_list = list(range(100))  # prepare data, it must be iterable
    result = runner(data_list)  # call the runner
    print(result)
    runner.close()  # close the runner and shut down all processes it opened.
Example 2: A neural network predictor¶
First, we define a neural network in network.py:
import torch
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = nn.Linear(1, 3)

    def forward(self, x):
        x = self.fc(x)
        x = F.relu(x)
        return x
The network can be run in parallel on 4 GPUs in the following way:
from easycore.common.config import CfgNode
from easycore.common.parallel import OrderedRunner
from network import Net
import torch
class Predictor(OrderedRunner):
    @staticmethod
    def producer_init(device, cfg):
        cfg.model = Net()  # init the producer with a model
        cfg.model.to(device)  # transfer the model to certain device

    @staticmethod
    def producer_work(device, cfg, data):
        with torch.no_grad():
            data = torch.Tensor([[data]])  # preprocess data
            data = data.to(device)  # transfer data to certain device
            output = cfg.model(data)  # predict
            output = output.cpu()  # transfer result to cpu
        return output

    @staticmethod
    def producer_end(device, cfg):
        del cfg.model  # delete the model when all data has been predicted.

    @staticmethod
    def consumer_init(cfg):
        cfg.data_list = []  # prepare a list to store all data from producers.

    @staticmethod
    def consumer_work(cfg, data):
        cfg.data_list.append(data)  # store data from producers.

    @staticmethod
    def consumer_end(cfg):
        data = torch.cat(cfg.data_list, dim=0)  # postprocess data.
        return data

if __name__ == '__main__':
    predictor = Predictor(devices=["cuda:0", "cuda:1", "cuda:2", "cuda:3"])  # init a parallel predictor
    data_list = list(range(100))  # prepare data
    result = predictor(data_list)  # predict
    print(result.shape)
    predictor.close()  # close the predictor when you no longer need it.
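Example 2 uses OrderedRunner, presumably so that row i of the concatenated output corresponds to input i. The difference between ordered and unordered collection can be sketched with the standard library alone. This sketch uses `concurrent.futures` threads instead of easycore's processes, and `square` is a placeholder task, so it illustrates the idea rather than the library's internals.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def square(x):
    return x * x


data_list = list(range(20))
with ThreadPoolExecutor(max_workers=4) as pool:
    # Ordered: results come back in the same order as the inputs.
    ordered = list(pool.map(square, data_list))
    # Unordered: results are collected as soon as each task finishes.
    futures = [pool.submit(square, x) for x in data_list]
    unordered = [future.result() for future in as_completed(futures)]

print(ordered == [x * x for x in data_list])  # True
print(sorted(unordered) == ordered)  # True: same values, arrival order may differ
```

An order-insensitive reduction such as a sum can use unordered collection, while results that must line up with their inputs need ordered collection.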
Example 3: Process data with batch¶
You can use a simple generator or a PyTorch DataLoader to generate batched data.
from easycore.common.config import CfgNode
from easycore.torch.parallel import OrderedRunner
from network import Net
import torch
def batch_generator(data_list, batch_size):
    for i in range(0, len(data_list), batch_size):
        data_batch = data_list[i : i+batch_size]
        yield data_batch

class Predictor(OrderedRunner):
    @staticmethod
    def producer_init(device, cfg):
        cfg.model = Net()
        cfg.model.to(device)

    @staticmethod
    def producer_work(device, cfg, data):
        with torch.no_grad():
            data = torch.Tensor(data).view(-1,1)
            data = data.to(device)
            output = cfg.model(data)
            output = output.cpu()
        return output

    @staticmethod
    def producer_end(device, cfg):
        del cfg.model

    @staticmethod
    def consumer_init(cfg):
        cfg.data_list = []

    @staticmethod
    def consumer_work(cfg, data):
        cfg.data_list.append(data)

    @staticmethod
    def consumer_end(cfg):
        data = torch.cat(cfg.data_list, dim=0)
        return data

if __name__ == '__main__':
    predictor = Predictor(devices=["cuda:0", "cuda:1"])
    data_list = list(range(100))
    result = predictor(batch_generator(data_list, batch_size=10))
    print(result.shape)
    predictor.close()
Here, we replace easycore.common.parallel with easycore.torch.parallel. easycore.torch.parallel has the same API as easycore.common.parallel but uses the torch.multiprocessing library instead of the multiprocessing library.
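To see what each producer receives in Example 3, the batch generator can be run on its own. The snippet below repeats the generator from the example with a small input. The last batch is shorter when the list length is not a multiple of the batch size.

```python
def batch_generator(data_list, batch_size):
    # Yield consecutive slices of at most `batch_size` items.
    for i in range(0, len(data_list), batch_size):
        yield data_list[i : i + batch_size]


batches = list(batch_generator(list(range(10)), batch_size=4))
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```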
Example 4: Transfer outside parameters into Runner¶
You can transfer parameters into the runner through the cfg parameter. cfg is an easycore.common.config.CfgNode. See the tutorial “Light weight config tools” for how to use it.
We use “sum of powers” as an example:
from easycore.common.config import CfgNode as CN
from easycore.common.parallel import UnorderedRunner
class Runner(UnorderedRunner):
    @staticmethod
    def producer_work(device, cfg, data):
        return data ** cfg.exponent  # calculate power of data with outside parameter "exponent".

    @staticmethod
    def consumer_init(cfg):
        cfg.sum = 0  # init a sum variable with 0, you can use cfg to transfer data

    @staticmethod
    def consumer_work(cfg, data):
        cfg.sum += data  # add the power to the sum variable

    @staticmethod
    def consumer_end(cfg):
        return cfg.sum  # return the result you need

if __name__ == '__main__':
    # set parameters outside.
    cfg = CN()
    cfg.exponent = 3
    runner = Runner(devices=3, cfg=cfg)  # transfer `cfg` into the runner
    data_list = list(range(100))
    result = runner(data_list)
    print(result)
    runner.close()
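As a sanity check for Example 4, the same quantity can be computed sequentially in plain Python. This reference computation does not use easycore. The parallel runner would be expected to print the same value, since a sum does not depend on the order in which results arrive.

```python
exponent = 3
data_list = list(range(100))
# Sequential reference: sum of cubes of 0..99.
expected = sum(data ** exponent for data in data_list)
print(expected)  # 24502500
```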
API Documentation¶
Register Mechanism¶
easycore makes it easy to register an object under a name and retrieve it later.
Create a registry¶
from easycore.common.registry import Registry
MODEL_REGISTRY = Registry("MODEL")
Register an object with its __name__¶
@MODEL_REGISTRY.register()
class ResNet50:
    pass
# or
MODEL_REGISTRY.register(obj=ResNet50)
Register an object with a given name¶
@MODEL_REGISTRY.register("resnet")
class ResNet50:
    pass
# or
MODEL_REGISTRY.register("resnet", ResNet50)
Get a registered object from registry¶
model_class = MODEL_REGISTRY.get("ResNet50")
# or
model_class = MODEL_REGISTRY.get("resnet")
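The dual use of `register`, as a decorator or as a direct call, can be sketched as follows. `MiniRegistry` is a hypothetical, simplified class written for this illustration and is not easycore's `Registry`; it only mirrors the calling conventions shown above.

```python
class MiniRegistry:
    """Hypothetical minimal registry; not easycore's implementation."""

    def __init__(self, name):
        self._name = name
        self._objects = {}

    def register(self, name=None, obj=None):
        if obj is not None:
            # Direct call: store the object under the given name or its __name__.
            self._objects[name or obj.__name__] = obj
            return None

        def decorator(target):
            # Decorator use: register the target and hand it back unchanged.
            self._objects[name or target.__name__] = target
            return target

        return decorator

    def get(self, name):
        if name not in self._objects:
            raise KeyError(f"'{name}' is not registered in {self._name}")
        return self._objects[name]


MODELS = MiniRegistry("MODEL")


@MODELS.register()
class ResNet50:
    pass


MODELS.register("resnet", ResNet50)
print(MODELS.get("ResNet50") is MODELS.get("resnet"))  # True
```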
API Documentation¶
Path manage tools¶
easycore makes it easy to manage local paths and remote paths in the same way.
Manage local path¶
from easycore.common.path import PathManager
# open a local path
with PathManager.open("/path/to/file", 'r', encoding='utf-8') as f:
    print(f.read())
# check file or directory exists, similar to `os.path.exists`
PathManager.exists('/path/to/file')
# isfile and isdir, similar to `os.path.isfile` and `os.path.isdir`.
success = PathManager.isfile('/path/to/file')
success = PathManager.isdir('/path/to/dir')
# makedirs, similar to `os.makedirs(path, exist_ok=True)`.
PathManager.makedirs('/path/to/dir')
# remove file (not directory), similar to `os.remove`.
PathManager.remove('/path/to/file')
# remove directories, similar to `os.removedirs`.
PathManager.removedirs('/path/to/dir')
# list directory, similar to `os.listdir`.
list_content = PathManager.listdir('/path/to/dir')
# copy
PathManager.copy("/path/to/file2", "/destination/path")
Manage remote URL¶
You can manage a remote path (an http/https/ftp URL), which may look like http://xxx.com/yyy.txt.
The remote file is first downloaded and cached. The cache directory is set by
- the environment variable $EASYCORE_CACHE, if set.
- otherwise, ~/.easycore/cache.
from easycore.common.path import PathManager
# open a remote path
with PathManager.open("http://xxx.com/yyy.txt", 'r', encoding='utf-8') as f:
    print(f.read())
# get local path
local_path = PathManager.get_local_path("http://xxx.com/yyy.txt")
You can copy the file to a local path.
# copy remote file to local path.
PathManager.copy("http://xxx.com/yyy.txt", "/a/local/path")
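The cache-directory rule described in this section can be sketched with the standard library. `resolve_cache_dir` is a hypothetical helper written for this illustration, not part of easycore. The per-protocol subdirectory follows the `get_cache_dir` description in the API reference (`<base-cache-dir>/protocol`).

```python
import os


def resolve_cache_dir(protocol=None):
    """Hypothetical helper mirroring the documented cache-directory rule."""
    # $EASYCORE_CACHE wins if set; otherwise fall back to ~/.easycore/cache.
    base = os.path.expanduser(os.getenv("EASYCORE_CACHE", "~/.easycore/cache"))
    return base if protocol is None else os.path.join(base, protocol)


os.environ["EASYCORE_CACHE"] = os.path.join("tmp", "my-cache")
print(resolve_cache_dir("http"))
```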
Redirect path¶
You can redirect a path to anywhere, local or remote.
For example, if you have uploaded a file to a remote server and can access it through the URL http://xxx.com/download/yyy/zzz.txt, PathManager makes it possible to redirect the easycore:// prefix to http://xxx.com/download so that you can access the resource with the path easycore://yyy/zzz.txt.
from easycore.common.path import PathManager, RedirectPathHandler
PathManager.register(RedirectPathHandler("easycore://", "http://xxx.com/download/"))
You can also redirect to a local path.
from easycore.common.path import PathManager, RedirectPathHandler
PathManager.register(RedirectPathHandler("file://", "/path/to/dir/"))
This feature is very useful for redirecting dataset paths. For example, my dataset directories are at e:\\Dataset\\MNIST, e:\\Dataset\\CIFAR100 and f:\\ImageNet.
from easycore.common.path import PathManager, RedirectPathHandler
PathManager.register(RedirectPathHandler("dataset://MNIST/", "e:\\Dataset\\MNIST\\"))
PathManager.register(RedirectPathHandler("dataset://CIFAR100/", "e:\\Dataset\\CIFAR100\\"))
PathManager.register(RedirectPathHandler("dataset://ImageNet/", "f:\\ImageNet\\"))
Now, I can access them with the path dataset://<dataset-name>/.
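Conceptually, a redirect is a prefix rewrite. The sketch below shows that idea with a hypothetical `redirect` function. It is an illustration only and does not reproduce how `RedirectPathHandler` is implemented.

```python
def redirect(path, old_prefix, new_prefix):
    """Hypothetical prefix rewrite; returns the path unchanged if it does not match."""
    if path.startswith(old_prefix):
        return new_prefix + path[len(old_prefix):]
    return path


print(redirect("easycore://yyy/zzz.txt", "easycore://", "http://xxx.com/download/"))
# http://xxx.com/download/yyy/zzz.txt
```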
Custom PathHandler¶
The behavior of PathManager is defined by the registered PathHandlers.
You can also write a custom PathHandler and register it to PathManager.
For example, if you want to redirect the http and https cache directory without setting $EASYCORE_CACHE, you can customize the HTTPURLHandler yourself.
from easycore.common.path import PathManager, HTTPURLHandler
import os
# custom PathHandler
class NewHTTPURLHandler(HTTPURLHandler):
    def get_cache_dir(self, protocol=None):
        cache_dir = os.path.expanduser(os.getenv("NEW_CACHE", "~/.easycore/cache"))
        if protocol is not None:
            cache_dir = os.path.join(cache_dir, protocol)
        return cache_dir

    def get_supported_prefixes(self):
        return ["http://", "https://"]

# register custom path handler
PathManager.register(NewHTTPURLHandler(), override=True)  # set override to True to override the existing http and https path handler.
Now, you can set the cache directory through $NEW_CACHE.
See the implementations of NativePathHandler, HTTPURLHandler, and RedirectPathHandler for more custom path handler examples.
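The idea of routing a path to a handler by its prefix, with an `override` flag guarding existing registrations, can be sketched as below. `MiniPathManager` is hypothetical and handlers are plain strings for brevity. Preferring the longest matching prefix and falling back to a native handler are assumptions of this sketch, not documented easycore behavior.

```python
class MiniPathManager:
    """Hypothetical prefix-to-handler table; not easycore's PathManager."""

    def __init__(self):
        self._handlers = {}

    def register(self, prefixes, handler, override=False):
        for prefix in prefixes:
            if prefix in self._handlers and not override:
                raise KeyError(f"prefix '{prefix}' already has a handler")
            self._handlers[prefix] = handler

    def find(self, path):
        # Assumption: prefer the longest matching prefix so specific handlers win.
        for prefix in sorted(self._handlers, key=len, reverse=True):
            if path.startswith(prefix):
                return self._handlers[prefix]
        return "native"  # assumption: unmatched paths go to a local-path handler


manager = MiniPathManager()
manager.register(["http://", "https://"], "http-handler")
manager.register(["http://", "https://"], "new-http-handler", override=True)
print(manager.find("https://xxx.com/yyy.txt"), manager.find("/path/to/file"))
```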
API Documentation¶
easycore.common¶
easycore.common.config¶
class easycore.common.config.CfgNode(init_dict: dict = None, copy=True)
Bases: dict
Config Node
freeze(frozen: bool = True)
freeze or unfreeze the CfgNode and all of its children
Parameters: frozen (bool) – freeze or unfreeze the config
merge(cfg)
merge another CfgNode into this CfgNode; the other CfgNode overrides this CfgNode.
Parameters: cfg (CfgNode) –
classmethod open(file, encoding='utf-8')
load a CfgNode from file.
Returns: CfgNode
classmethod load(yaml_str: str)
load a CfgNode from a string in yaml format
Parameters: yaml_str (str) –
Returns: CfgNode
classmethod dump(cfg, stream=None, encoding=None, **kwargs)
dump a CfgNode into a yaml str or a yaml file
Note
if the stream option is set to a non-None object, the CfgNode is dumped into the stream and None is returned; if the stream option is not given or is set to None, a string is returned instead.
Parameters:
- cfg (CfgNode) –
- stream (io.IOBase or None) – if set to a file object, the CfgNode is dumped into the stream and None is returned; if set to None, a string is returned instead.
- encoding (str or None) –
- **kwargs – options of the yaml dumper. Some useful options: [“allow_unicode”, “line_break”, “explicit_start”, “explicit_end”, “version”, “tags”]. See more details at https://github.com/yaml/pyyaml/blob/2f463cf5b0e98a52bc20e348d1e69761bf263b86/lib3/yaml/__init__.py#L252
Returns: None or str
class easycore.common.config.HierarchicalCfgNode
Bases: object
Config Node helper class for opening a yaml file that depends on another yaml file.
You can specify the dependency between yaml files with the __BASE__ tag.
Example
We can load the yaml file example-A.yaml, which depends on example-B.yaml, in the following way.
example-A.yaml:
__BASE__: ./example-B.yaml
A: in example-A.yaml
C: in example-A.yaml
example-B.yaml:
A: in example-B.yaml
B: in example-B.yaml
Now, you can open example-A.yaml:
>>> from easycore.common.config import HierarchicalCfgNode
>>> cfg = HierarchicalCfgNode.open("./example-A.yaml")
>>> print(cfg)
{"A" : "in example-A.yaml", "B" : "in example-B.yaml", "C" : "in example-A.yaml"}
Attributes in example-A.yaml override attributes in example-B.yaml.
Note
__BASE__ can be an absolute path or a path relative to the yaml file. It is first treated as a path relative to the yaml file and then as an absolute path.
classmethod open(file, encoding='utf-8')
load a CfgNode from file.
Returns: CfgNode
-
classmethod
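The `__BASE__` resolution described for HierarchicalCfgNode can be sketched with in-memory dicts standing in for the yaml files. `resolve` and the `files` table are hypothetical and exist only for this illustration. The sketch uses a shallow `dict.update`, whereas the library may merge nested sections differently.

```python
def resolve(name, files):
    """Hypothetical resolver: load `name`, then let it override its __BASE__ file."""
    node = dict(files[name])
    base_name = node.pop("__BASE__", None)
    if base_name is None:
        return node
    merged = resolve(base_name, files)  # the base may itself have a base
    merged.update(node)  # attributes of the dependent file win
    return merged


files = {
    "example-A.yaml": {
        "__BASE__": "example-B.yaml",
        "A": "in example-A.yaml",
        "C": "in example-A.yaml",
    },
    "example-B.yaml": {"A": "in example-B.yaml", "B": "in example-B.yaml"},
}
cfg = resolve("example-A.yaml", files)
print(cfg)
```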
easycore.common.parallel¶
-
class
easycore.common.parallel.
BaseRunner
(devices, cfg={}, queue_scale=3.0)[source]¶ Bases:
object
A multi-process runner whose consumer receives data in arbitrary order. The runner starts multiple processes for producers and 1 thread for the consumer.
-
__init__
(devices, cfg={}, queue_scale=3.0)[source]¶ Parameters: - devices (int or Iterable) – If the devices is int, it will use devices cpu to do the work. If the devices is an iterable object, such as list, it will use the devices specified by the iterable object, such as [“cpu”, “cuda:0”, “cuda:1”].
- cfg (easycore.common.config.CfgNode) – user custom data.
- queue_scale (float) – scale the queues for communication between processes.
-
is_activate
¶ whether the runner is alive.
-
static
producer_init
(device, cfg)[source]¶ function for producer initialization.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to transfer data to producer_work and producer_end function.
-
static
producer_work
(device, cfg, data)[source]¶ function specify how the producer processes the data.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from producer_init function and transfer data to the next producer_work and producer_end function.
- data (Any) – data get from input of __call__ method.
Returns: Any – processed data
-
static
producer_end
(device, cfg)[source]¶ function after finishing all of its task and before close the process.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from producer_init and producer_work function.
-
static
consumer_init
(cfg)[source]¶ function for consumer initialization.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to transfer data to consumer_work and consumer_end function.
-
static
consumer_work
(cfg, data)[source]¶ function specifying how the consumer processes the data from producers.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from consumer_init function and transfer data to the next consumer_work and consumer_end function.
-
static
consumer_end
(cfg)[source]¶ function after receiving all data from producers.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from consumer_work function. Returns: Any – processed data
-
-
class
easycore.common.parallel.
UnorderedRunner
(devices, cfg={}, queue_scale=3.0)[source]¶ Bases:
easycore.common.parallel.engine.BaseRunner
A multi-process runner whose consumer receives data in arbitrary order. The runner starts multiple processes for producers and 1 thread for the consumer.
-
__init__
(devices, cfg={}, queue_scale=3.0)[source]¶ Parameters: - devices (int or Iterable) – If the devices is int, it will use devices cpu to do the work. If the devices is an iterable object, such as list, it will use the devices specified by the iterable object, such as [“cpu”, “cuda:0”, “cuda:1”].
- cfg (easycore.common.config.CfgNode) – user custom data.
- queue_scale (float) – scale the queues for communication between processes.
-
__call__
(data_iter)¶ Parameters: data_iter (Iterable) – iterator of data Returns: Any – result
-
activate
()¶ Restart all processes if this runner is closed.
-
close
()¶ Shutdown all processes if this runner is alive.
-
static
consumer_end
(cfg)¶ function after receiving all data from producers.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from consumer_work function. Returns: Any – processed data
-
static
consumer_init
(cfg)¶ function for consumer initialization.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to transfer data to consumer_work and consumer_end function.
-
static
consumer_work
(cfg, data)¶ function specifying how the consumer processes the data from producers.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from consumer_init function and transfer data to the next consumer_work and consumer_end function.
-
is_activate
¶ whether the runner is alive.
-
static
producer_end
(device, cfg)¶ function after finishing all of its task and before close the process.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from producer_init and producer_work function.
-
static
producer_init
(device, cfg)¶ function for producer initialization.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to transfer data to producer_work and producer_end function.
-
static
producer_work
(device, cfg, data)¶ function specify how the producer processes the data.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from producer_init function and transfer data to the next producer_work and producer_end function.
- data (Any) – data get from input of __call__ method.
Returns: Any – processed data
-
-
class
easycore.common.parallel.
OrderedRunner
(devices, cfg={}, queue_scale=3.0)[source]¶ Bases:
easycore.common.parallel.engine.BaseRunner
A multi-process runner whose consumer receives data in order. The runner starts multiple processes for producers and 1 thread for the consumer.
-
__init__
(devices, cfg={}, queue_scale=3.0)[source]¶ Parameters: - devices (int or Iterable) – If the devices is int, it will use devices cpu to do the work. If the devices is an iterable object, such as list, it will use the devices specified by the iterable object, such as [“cpu”, “cuda:0”, “cuda:1”].
- cfg (easycore.common.config.CfgNode) – user custom data.
- queue_scale (float) – scale the queues for communication between processes.
-
__call__
(data_iter)¶ Parameters: data_iter (Iterable) – iterator of data Returns: Any – result
-
static
consumer_end
(cfg)¶ function after receiving all data from producers.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from consumer_work function. Returns: Any – processed data
-
static
consumer_init
(cfg)¶ function for consumer initialization.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to transfer data to consumer_work and consumer_end function.
-
static
consumer_work
(cfg, data)¶ function specifying how the consumer processes the data from producers.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from consumer_init function and transfer data to the next consumer_work and consumer_end function.
-
is_activate
¶ whether the runner is alive.
-
static
producer_end
(device, cfg)¶ function after finishing all of its task and before close the process.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from producer_init and producer_work function.
-
static
producer_init
(device, cfg)¶ function for producer initialization.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to transfer data to producer_work and producer_end function.
-
static
producer_work
(device, cfg, data)¶ function specify how the producer processes the data.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from producer_init function and transfer data to the next producer_work and producer_end function.
- data (Any) – data get from input of __call__ method.
Returns: Any – processed data
-
easycore.common.registry¶
class easycore.common.registry.Registry(name: str)
Bases: object
The registry that provides name -> object mapping.
To create a registry:
MODEL_REGISTRY = Registry("MODEL")
To register an object with its __name__:
@MODEL_REGISTRY.register()
class ResNet50:
    pass
# or
MODEL_REGISTRY.register(obj=ResNet50)
To register an object with a given name:
@MODEL_REGISTRY.register("resnet")
class ResNet50:
    pass
# or
MODEL_REGISTRY.register("resnet", ResNet50)
To get a registered object from registry:
model_class = MODEL_REGISTRY.get("ResNet50")
# or
model_class = MODEL_REGISTRY.get("resnet")
register(name: str = None, obj: object = None) → Optional[object]
Register the given object with the given name. If the object is not given, it will act as a decorator.
Returns: Optional[object] – None or a decorator.
unregister(name: str) → None
Remove registered object.
Parameters: name (str) – registered name
is_registered(name)
Get whether the given name has been registered.
Parameters: name (str) –
Returns: bool – whether the name has been registered.
easycore.common.network¶
-
easycore.common.network.
download_file
(url: str, dir: str, filename: Optional[str] = None, progress: bool = True) → str[source]¶ Download a file from a given URL to a directory. If the file exists, will not overwrite the existing file.
Parameters: Returns: str – the path to the downloaded file or the existing one.
easycore.common.path¶
-
class
easycore.common.path.
PathManager
[source]¶ Bases:
object
A general path manager for URI.
-
static
register
(handler: easycore.common.path.path_handler.PathHandler, override: bool = False) → None[source]¶ Register a path handler.
Parameters: - handler (PathHandler) –
- override (bool) – allow overriding existing handler for prefix.
-
static
open
(path: str, mode: str = 'r', **kwargs)[source]¶ Open a stream to a URI, similar to the built-in open.
Parameters: Returns: IO – a file-like object.
-
static
copy
(src_path: str, dst_path: str, overwrite: bool = False) → bool[source]¶ Copy a source path to a destination path.
Parameters: Returns: bool – True on success.
-
static
copy_from_local
(local_path: str, dst_path: str, overwrite: bool = False) → bool[source]¶ Copy a resource from a local path to a destination path.
Note
This interface is for custom PathHandlers; it is preferred to use copy() instead.
Parameters: Returns: bool – True on success.
-
static
get_local_path
(path: str) → str[source]¶ Get a file path which is compatible with native Python I/O such as open and os.path.
Note
If URI points to a remote resource, this function may download and cache the resource to local disk.
Parameters: path (str) – A URI supported by registered PathHandler. Returns: str – a file path which exists on the local file system.
-
static
exists
(path: str) → bool[source]¶ Checks if there is a resource at the given URI.
Parameters: path (str) – A URI supported by registered PathHandler. Returns: bool – True if the path exists.
-
static
isfile
(path: str) → bool[source]¶ Checks if the resource at the given URI is a file.
Parameters: path (str) – A URI supported by registered PathHandler. Returns: bool – True if the path is a file.
-
static
isdir
(path: str) → bool[source]¶ Checks if the resource at the given URI is a directory.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path is a directory.
-
static
listdir
(path: str) → List[str][source]¶ List the contents of the directory at the given URI.
Parameters: path (str) – A URI supported by registered PathHandler. Returns: List[str] – list of contents in the given path.
-
static
makedirs
(path: str) → None[source]¶ Recursive directory creation function. Similar to os.makedirs().
Parameters: path (str) – A URI supported by registered PathHandler.
-
static
-
class
easycore.common.path.
PathHandler
[source]¶ Bases:
object
Base Path handler class for a URI. It routes I/O for a generic URI which may look like “protocol://path/to/file”.
-
get_cache_dir
(protocol: Optional[str] = None) → str[source]¶ Return a cache directory like <base-cache-dir>/protocol.
The <base-cache-dir> is
- $EASYCORE_CACHE, if set
- otherwise ~/.easycore/cache
Parameters: protocol (str or None) – protocol such as ‘http’, ‘https’. If None, returns the base cache dir.
-
get_supported_prefixes
() → List[str][source]¶ Returns: List[str] – the list of URI prefixes the PathHandler can support.
-
get_local_path
(path: str) → str[source]¶ Get a file path which is compatible with native Python I/O such as open and os.path.
Parameters: path (str) – A URI supported by this PathHandler. Returns: local_path (str) – a file path which exists on the local file system.
-
open
(path: str, mode: str = 'r', **kwargs)[source]¶ Open a stream to a URI, similar to the built-in open.
Parameters: Returns: IO – a file-like object.
-
exists
(path: str) → bool[source]¶ Checks if there is a resource at the given URI.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path exists.
-
isfile
(path: str) → bool[source]¶ Checks if the resource at the given URI is a file.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path is a file.
-
isdir
(path: str) → bool[source]¶ Checks if the resource at the given URI is a directory.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path is a directory.
-
listdir
(path: str) → bool[source]¶ List the contents of the directory at the given URI.
Parameters: path (str) – A URI supported by the PathHandler. Returns: List[str] – list of contents in given path.
-
makedirs
(path: str) → None[source]¶ Recursive directory creation function. Similar to os.makedirs
Parameters: path (str) – A URI supported by this PathHandler.
-
remove
(path: str) → None[source]¶ Remove the file (not directory) at the given URI.
Parameters: path (str) – A URI supported by this PathHandler.
-
removedirs
(path: str) → None[source]¶ Remove directories recursively.
Parameters: path (str) – A URI supported by this PathHandler.
-
-
class
easycore.common.path.
NativePathHandler
[source]¶ Bases:
easycore.common.path.path_handler.PathHandler
PathHandler for local path.
-
get_local_path
(path: str) → str[source]¶ Get a file path which is compatible with native Python I/O such as open and os.path.
Parameters: path (str) – A URI supported by this PathHandler. Returns: local_path (str) – a file path which exists on the local file system.
-
open
(path: str, mode: str = 'r', **kwargs)[source]¶ Open a stream to a URI, similar to the built-in open.
Parameters: Returns: IO – a file-like object.
-
exists
(path: str) → bool[source]¶ Checks if there is a resource at the given URI.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path exists.
-
isfile
(path: str) → bool[source]¶ Checks if the resource at the given URI is a file.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path is a file.
-
isdir
(path: str) → bool[source]¶ Checks if the resource at the given URI is a directory.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path is a directory.
-
listdir
(path: str) → bool[source]¶ List the contents of the directory at the given URI.
Parameters: path (str) – A URI supported by the PathHandler. Returns: List[str] – list of contents in given path.
-
makedirs
(path: str) → None[source]¶ Recursive directory creation function. Similar to os.makedirs
Parameters: path (str) – A URI supported by this PathHandler.
-
remove
(path: str) → None[source]¶ Remove the file (not directory) at the given URI.
Parameters: path (str) – A URI supported by this PathHandler.
-
removedirs
(path: str) → None[source]¶ Remove directories recursively.
Parameters: path (str) – A URI supported by this PathHandler.
-
copy_from_local
(local_path: str, dst_path: str, overwrite: bool = False) → None[source]¶ Copy a local file to the given URI.
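Parameters: - local_path (str) – path of the local file to copy.
- dst_path (str) – destination URI.
- overwrite (bool) – whether to overwrite an existing file at the destination. Defaults to False.
Returns: bool – True on success.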
-
get_cache_dir
(protocol: Optional[str] = None) → str¶ Return a cache directory like <base-cache-dir>/protocol.
The <base-cache-dir> is
- $EASYCORE_CACHE, if set
- otherwise ~/.easycore/cache
Parameters: protocol (str or None) – protocol such as ‘http’, ‘https’. If None, returns the base cache dir.
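The lookup rule above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: resolve_cache_dir and its environ parameter are hypothetical names, and treating an empty $EASYCORE_CACHE as unset is an assumption.

```python
import os


def resolve_cache_dir(protocol=None, environ=None):
    """Hypothetical helper that mirrors the documented cache directory rule."""
    # Allow a custom mapping so the rule can be exercised without touching os.environ.
    environ = os.environ if environ is None else environ
    # Assumption: an empty EASYCORE_CACHE value is treated the same as an unset one.
    base = environ.get("EASYCORE_CACHE") or os.path.expanduser("~/.easycore/cache")
    # With no protocol, the base cache directory itself is returned.
    return base if protocol is None else os.path.join(base, protocol)
```

Calling resolve_cache_dir("http") with EASYCORE_CACHE unset would give the http subdirectory of ~/.easycore/cache, expanded for the current user.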
-
get_supported_prefixes
() → List[str]¶ Returns: List[str] – the list of URI prefixes the PathHandler can support.
-
-
class
easycore.common.path.
HTTPURLHandler
[source]¶ Bases:
easycore.common.path.path_handler.PathHandler
Download URLs and cache them to disk.
-
get_supported_prefixes
() → List[str][source]¶ Returns: List[str] – the list of URI prefixes the PathHandler can support.
-
get_local_path
(path: str) → str[source]¶ Get a file path which is compatible with native Python I/O such as open and os.path.
Parameters: path (str) – A URI supported by this PathHandler. Returns: local_path (str) – a file path which exists on the local file system.
-
open
(path: str, mode: str = 'r', **kwargs)[source]¶ Open a stream to a URI, similar to the built-in open.
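Parameters: - path (str) – A URI supported by this PathHandler.
- mode (str) – mode in which the stream is opened, as for the built-in open. Defaults to 'r'.
Returns: IO – a file-like object.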
-
copy_from_local
(local_path: str, dst_path: str, overwrite: bool = False) → None¶ Copy a local file to the given URI.
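Parameters: - local_path (str) – path of the local file to copy.
- dst_path (str) – destination URI.
- overwrite (bool) – whether to overwrite an existing file at the destination. Defaults to False.
Returns: bool – True on success.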
-
exists
(path: str) → bool¶ Checks if there is a resource at the given URI.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path exists.
-
get_cache_dir
(protocol: Optional[str] = None) → str¶ Return a cache directory like <base-cache-dir>/protocol.
The <base-cache-dir> is
- $EASYCORE_CACHE, if set
- otherwise ~/.easycore/cache
Parameters: protocol (str or None) – protocol such as ‘http’, ‘https’. If None, returns the base cache dir.
-
isdir
(path: str) → bool¶ Checks if the resource at the given URI is a directory.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path is a directory.
-
isfile
(path: str) → bool¶ Checks if the resource at the given URI is a file.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path is a file.
-
listdir
(path: str) → bool¶ List the contents of the directory at the given URI.
Parameters: path (str) – A URI supported by the PathHandler. Returns: List[str] – list of contents in given path.
-
makedirs
(path: str) → None¶ Recursive directory creation function. Similar to os.makedirs
Parameters: path (str) – A URI supported by this PathHandler.
-
-
class
easycore.common.path.
RedirectPathHandler
(new_prefix: str, old_prefix: str)[source]¶ Bases:
easycore.common.path.path_handler.PathHandler
Redirect a new prefix to an existing prefix.
Example
PathManager.register(RedirectPathHandler("easycore://", "http://xxx.com/download/"))
-
get_supported_prefixes
() → List[str][source]¶ Returns: List[str] – the list of URI prefixes the PathHandler can support.
-
redirect
(path: str) → str[source]¶ Translate a path that starts with new_prefix into the corresponding path under old_prefix.
Parameters: path (str) – path that starts with new_prefix. Returns: str – the corresponding path under old_prefix.
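A prefix redirect of this kind amounts to a string substitution at the start of the path. The sketch below is an assumption about the behaviour, not the library's code: the free function redirect and its error handling are hypothetical, and the URL is the placeholder from the class example.

```python
def redirect(path, new_prefix, old_prefix):
    """Hypothetical stand-in: swap new_prefix for old_prefix at the start of path."""
    if not path.startswith(new_prefix):
        # Assumption: paths outside the redirected prefix are rejected.
        raise ValueError(f"{path!r} does not start with {new_prefix!r}")
    # Keep everything after the new prefix and append it to the old prefix.
    return old_prefix + path[len(new_prefix):]


target = redirect("easycore://models/a.pth", "easycore://", "http://xxx.com/download/")
# target is "http://xxx.com/download/models/a.pth"
```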
-
get_local_path
(path: str) → str[source]¶ Get a file path which is compatible with native Python I/O such as open and os.path.
Parameters: path (str) – A URI supported by this PathHandler. Returns: local_path (str) – a file path which exists on the local file system.
-
open
(path: str, mode: str = 'r', **kwargs)[source]¶ Open a stream to a URI, similar to the built-in open.
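Parameters: - path (str) – A URI supported by this PathHandler.
- mode (str) – mode in which the stream is opened, as for the built-in open. Defaults to 'r'.
Returns: IO – a file-like object.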
-
exists
(path: str) → bool[source]¶ Checks if there is a resource at the given URI.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path exists.
-
isfile
(path: str) → bool[source]¶ Checks if the resource at the given URI is a file.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path is a file.
-
isdir
(path: str) → bool[source]¶ Checks if the resource at the given URI is a directory.
Parameters: path (str) – A URI supported by this PathHandler. Returns: bool – True if the path is a directory.
-
listdir
(path: str) → bool[source]¶ List the contents of the directory at the given URI.
Parameters: path (str) – A URI supported by the PathHandler. Returns: List[str] – list of contents in given path.
-
makedirs
(path: str) → None[source]¶ Recursive directory creation function. Similar to os.makedirs
Parameters: path (str) – A URI supported by this PathHandler.
-
remove
(path: str) → None[source]¶ Remove the file (not directory) at the given URI.
Parameters: path (str) – A URI supported by this PathHandler.
-
removedirs
(path: str) → None[source]¶ Remove directories recursively.
Parameters: path (str) – A URI supported by this PathHandler.
-
-
easycore.common.path.
file_lock
(path: str)[source]¶ A file lock. Once entered, it is guaranteed that no one else holds the same lock. Others trying to enter the lock will block for up to 30 minutes and then raise an exception.
This is useful to make sure workers don’t cache files to the same location.
Parameters: path (str) – a path to be locked. This function will create a lock named path + “.lock”.
Examples
>>> filename = "/path/to/file"
>>> with file_lock(filename):
>>>     if not os.path.isfile(filename):
>>>         do_create_file()
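To illustrate the locking pattern described above, here is a minimal standard-library sketch. It is not how file_lock is implemented: simple_file_lock, its timeout and poll_interval parameters, and the use of an exclusively created lock file are all assumptions made for the example.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def simple_file_lock(path, timeout=1800.0, poll_interval=0.1):
    """Hypothetical lock: hold path + ".lock" for the duration of the with block."""
    lock_path = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if another holder already created the lock file.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_interval)
    try:
        yield lock_path
    finally:
        # Release the lock so that waiting workers can proceed.
        os.close(fd)
        os.remove(lock_path)
```

The default timeout of 1800 seconds matches the 30 minutes mentioned above. Unlike an operating-system level lock, a lock file created this way stays behind if the holding process is killed, so this sketch suits illustration more than production use.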
easycore.torch¶
easycore.torch.parallel¶
-
class
easycore.torch.parallel.
BaseRunner
(devices, cfg={}, queue_scale=3.0)[source]¶ Bases:
object
A multi-process runner whose consumer receives data in arbitrary order. The runner starts multiple processes for the producers and one thread for the consumer.
-
__init__
(devices, cfg={}, queue_scale=3.0)[source]¶ Parameters: - devices (int or Iterable) – If devices is an int, the runner uses that many CPU workers. If devices is an iterable, such as a list, the runner uses the devices it names, such as [“cpu”, “cuda:0”, “cuda:1”].
- cfg (easycore.common.config.CfgNode) – user custom data.
- queue_scale (float) – scale the queues for communication between processes.
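One plausible reading of the devices parameter is that it is normalized to one device string per producer process. The helper below is a hypothetical sketch of that reading, not code taken from the library.

```python
def normalize_devices(devices):
    """Hypothetical helper: expand the devices argument to one name per producer."""
    if isinstance(devices, int):
        # Assumption: an int n means n producers that all run on the CPU.
        return ["cpu"] * devices
    # Otherwise each entry of the iterable names the device for one producer.
    return [str(device) for device in devices]
```

Under this reading, normalize_devices(2) and normalize_devices(["cpu", "cpu"]) describe the same set of producers.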
-
is_activate
¶ whether the runner is alive.
-
static
producer_init
(device, cfg)[source]¶ function for producer initialization.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to transfer data to producer_work and producer_end function.
-
static
producer_work
(device, cfg, data)[source]¶ function that specifies how the producer processes the data.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from producer_init function and transfer data to the next producer_work and producer_end function.
- data (Any) – data taken from the input of the __call__ method.
Returns: Any – processed data
-
static
producer_end
(device, cfg)[source]¶ function called after the producer has finished all of its tasks and before its process is closed.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from producer_init and producer_work function.
-
static
consumer_init
(cfg)[source]¶ function for consumer initialization.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to transfer data to consumer_work and consumer_end function.
-
static
consumer_work
(cfg, data)[source]¶ function that specifies how the consumer processes the data from the producers.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from consumer_init function and transfer data to the next consumer_work and consumer_end function.
-
static
consumer_end
(cfg)[source]¶ function called after all data from the producers has been received.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from consumer_work function. Returns: Any – processed data
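The six hooks are easiest to follow as a call sequence. The sketch below simulates that sequence in a single process with plain Python objects; it does not import easycore or start any processes. The class SquareSum, the driver run_in_one_process and the use of SimpleNamespace in place of CfgNode are hypothetical, and only the six hook names and their signatures come from the reference above.

```python
from types import SimpleNamespace


class SquareSum:
    """Hypothetical hooks: producers square each item, the consumer sums the squares."""

    @staticmethod
    def producer_init(device, cfg):
        cfg.scale = 1  # state shared with producer_work and producer_end

    @staticmethod
    def producer_work(device, cfg, data):
        return cfg.scale * data * data

    @staticmethod
    def producer_end(device, cfg):
        del cfg.scale  # release whatever producer_init set up

    @staticmethod
    def consumer_init(cfg):
        cfg.total = 0  # state shared with consumer_work and consumer_end

    @staticmethod
    def consumer_work(cfg, data):
        cfg.total += data

    @staticmethod
    def consumer_end(cfg):
        return cfg.total


def run_in_one_process(hooks, data_iter, device="cpu"):
    """Call the hooks in lifecycle order, without any multiprocessing."""
    producer_cfg, consumer_cfg = SimpleNamespace(), SimpleNamespace()
    hooks.producer_init(device, producer_cfg)
    hooks.consumer_init(consumer_cfg)
    for item in data_iter:
        processed = hooks.producer_work(device, producer_cfg, item)
        hooks.consumer_work(consumer_cfg, processed)
    hooks.producer_end(device, producer_cfg)
    return hooks.consumer_end(consumer_cfg)


result = run_in_one_process(SquareSum, range(5))  # 0 + 1 + 4 + 9 + 16 = 30
```

In a real runner each producer process would have its own cfg and the consumer would run in a separate thread, so state stored on one cfg should not be expected to appear on another.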
-
-
class
easycore.torch.parallel.
UnorderedRunner
(devices, cfg={}, queue_scale=3.0)[source]¶ Bases:
easycore.torch.parallel.engine.BaseRunner
A multi-process runner whose consumer receives data in arbitrary order. The runner starts multiple processes for the producers and one thread for the consumer.
-
__init__
(devices, cfg={}, queue_scale=3.0)[source]¶ Parameters: - devices (int or Iterable) – If devices is an int, the runner uses that many CPU workers. If devices is an iterable, such as a list, the runner uses the devices it names, such as [“cpu”, “cuda:0”, “cuda:1”].
- cfg (easycore.common.config.CfgNode) – user custom data.
- queue_scale (float) – scale the queues for communication between processes.
-
__call__
(data_iter)¶ Parameters: data_iter (Iterable) – iterator of data Returns: Any – result
-
activate
()¶ Restart all processes if this runner is closed.
-
close
()¶ Shutdown all processes if this runner is alive.
-
static
consumer_end
(cfg)¶ function called after all data from the producers has been received.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from consumer_work function. Returns: Any – processed data
-
static
consumer_init
(cfg)¶ function for consumer initialization.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to transfer data to consumer_work and consumer_end function.
-
static
consumer_work
(cfg, data)¶ function that specifies how the consumer processes the data from the producers.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from consumer_init function and transfer data to the next consumer_work and consumer_end function.
-
is_activate
¶ whether the runner is alive.
-
static
producer_end
(device, cfg)¶ function called after the producer has finished all of its tasks and before its process is closed.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from producer_init and producer_work function.
-
static
producer_init
(device, cfg)¶ function for producer initialization.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to transfer data to producer_work and producer_end function.
-
static
producer_work
(device, cfg, data)¶ function that specifies how the producer processes the data.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from producer_init function and transfer data to the next producer_work and producer_end function.
- data (Any) – data taken from the input of the __call__ method.
Returns: Any – processed data
-
-
class
easycore.torch.parallel.
OrderedRunner
(devices, cfg={}, queue_scale=3.0)[source]¶ Bases:
easycore.torch.parallel.engine.BaseRunner
A multi-process runner whose consumer receives data in order. The runner starts multiple processes for the producers and one thread for the consumer.
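When several producers work in parallel, results can finish out of order, so delivering them in order requires some form of reordering. The sketch below shows one common technique, a reorder buffer keyed by input index. It is an illustration only and makes no claim about how OrderedRunner is implemented; in_input_order and the (index, value) pairing are assumptions.

```python
import heapq


def in_input_order(indexed_results):
    """Yield values by ascending index from (index, value) pairs arriving in any order."""
    pending = []    # min-heap of results that arrived ahead of their turn
    next_index = 0  # index of the next result the consumer should see
    for index, value in indexed_results:
        heapq.heappush(pending, (index, value))
        # Release every buffered result that is now next in line.
        while pending and pending[0][0] == next_index:
            yield heapq.heappop(pending)[1]
            next_index += 1


arrived = [(2, "c"), (0, "a"), (1, "b"), (3, "d")]
ordered = list(in_input_order(arrived))  # ["a", "b", "c", "d"]
```

The buffer only grows while an early item is still outstanding, which is the price of ordered delivery compared with handing results to the consumer as they arrive.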
-
__init__
(devices, cfg={}, queue_scale=3.0)[source]¶ Parameters: - devices (int or Iterable) – If devices is an int, the runner uses that many CPU workers. If devices is an iterable, such as a list, the runner uses the devices it names, such as [“cpu”, “cuda:0”, “cuda:1”].
- cfg (easycore.common.config.CfgNode) – user custom data.
- queue_scale (float) – scale the queues for communication between processes.
-
__call__
(data_iter)¶ Parameters: data_iter (Iterable) – iterator of data Returns: Any – result
-
static
consumer_end
(cfg)¶ function called after all data from the producers has been received.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from consumer_work function. Returns: Any – processed data
-
static
consumer_init
(cfg)¶ function for consumer initialization.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to transfer data to consumer_work and consumer_end function.
-
static
consumer_work
(cfg, data)¶ function that specifies how the consumer processes the data from the producers.
Parameters: cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from consumer_init function and transfer data to the next consumer_work and consumer_end function.
-
is_activate
¶ whether the runner is alive.
-
static
producer_end
(device, cfg)¶ function called after the producer has finished all of its tasks and before its process is closed.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from producer_init and producer_work function.
-
static
producer_init
(device, cfg)¶ function for producer initialization.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to transfer data to producer_work and producer_end function.
-
static
producer_work
(device, cfg, data)¶ function that specifies how the producer processes the data.
Parameters: - device (str) – device for this process.
- cfg (easycore.common.config.CfgNode) – config of this process, you can use it to get data from producer_init function and transfer data to the next producer_work and producer_end function.
- data (Any) – data taken from the input of the __call__ method.
Returns: Any – processed data
-